Compare commits

7 Commits

Author SHA1 Message Date
Aodhan Collins
60eb89ea42 feat: character system v2 — schema upgrade, memory system, per-character TTS routing
Character schema v2: background, dialogue_style, appearance, skills, gaze_presets
with automatic v1→v2 migration. LLM-assisted character creation via Character MCP
server. Two-tier memory system (personal per-character + general shared) with
budget-based injection into LLM system prompt. Per-character TTS voice routing via
state file — Wyoming TTS server reads active config to route between Kokoro (local)
and ElevenLabs (cloud PCM 24kHz). Dashboard: memories page, conversation history,
character profile on cards, auto-TTS engine selection from character config.
Also includes VTube Studio expression bridge and ComfyUI API guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 19:15:46 +00:00
Aodhan Collins
1e52c002c2 feat: Raspberry Pi 5 kitchen satellite — Wyoming voice satellite with ReSpeaker pHAT
Add full Pi 5 satellite setup with ReSpeaker 2-Mics pHAT for kitchen
voice control via Wyoming protocol. Includes satellite_wrapper.py that
monkey-patches WakeStreamingSatellite to fix four compounding bugs:

- TTS echo suppression: mutes wake word detection while speaker plays
- Server writer race fix: checks _writer before streaming, re-arms on None
- Streaming timeout: auto-recovers after 30s if pipeline hangs
- Error recovery: resets streaming state on server Error events

Also includes Pi 5 hardware workarounds (wm8960 overlay, stereo-only
audio wrappers, ALSA mixer calibration) and deploy.sh with fast
iteration commands (--push-wrapper, --test-logs).
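For illustration, the monkey-patch pattern the wrapper describes might look roughly like this sketch; the hook-point method name and the playback flag are assumptions for illustration, not the actual wyoming-satellite API:

```python
# Sketch only: patch a WakeStreamingSatellite method at import time so
# wake word processing is skipped while the speaker is playing TTS.
from wyoming_satellite.satellite import WakeStreamingSatellite

_original = WakeStreamingSatellite.event_from_server  # hypothetical hook point

async def _patched(self, event):
    # TTS echo suppression: ignore events while playback is active
    # (`_tts_playing` is an illustrative flag, not a real attribute).
    if getattr(self, "_tts_playing", False):
        return
    await _original(self, event)

WakeStreamingSatellite.event_from_server = _patched
```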

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 20:09:47 +00:00
Aodhan Collins
5f147cae61 Merge branch 'esp32': ESP32-S3-BOX-3 room satellite with voice pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 20:48:09 +00:00
Aodhan Collins
c4cecbd8dc feat: ESP32-S3-BOX-3 room satellite — ESPHome config, OTA deploy, placeholder faces
Living room unit fully working: on-device wake word (hey_jarvis), voice pipeline
via HA (Wyoming STT → OpenClaw → Wyoming TTS), static PNG display states, OTA
updates. Includes deploy.sh for quick OTA with custom image support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 20:48:03 +00:00
Aodhan Collins
3c0d905e64 feat: unified HomeAI dashboard — merge character + desktop into single app
Combines homeai-character (service status, character profiles, editor) and
homeai-desktop (chat with voice I/O) into homeai-dashboard on port 5173.

- 4-page sidebar layout: Dashboard, Chat, Characters, Editor
- Merged Vite middleware: health checks, service restart, bridge proxy
- Bridge upgraded to ThreadingHTTPServer (fixes LAN request queuing)
- TTS strips emojis before synthesis
- Updated start.sh with new launchd service names
- Added preload-models to startup sequence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:40:04 +00:00
Aodhan Collins
0c33de607f feat: add homeai-desktop web assistant with LAN access
Adds the desktop web assistant app (Vite + React) with OpenClaw bridge
proxy and exposes it on the local network (host: 0.0.0.0, port 5174).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:16:09 +00:00
Aodhan Collins
2d063c7db7 Merge branch 'voice-pipeline': MLX Whisper STT, Qwen3.5 MoE, HA tool calling fix
2026-03-13 18:03:16 +00:00
100 changed files with 14219 additions and 489 deletions

View File

@@ -9,6 +9,7 @@ OPENAI_API_KEY=
 DEEPSEEK_API_KEY=
 GEMINI_API_KEY=
 ELEVENLABS_API_KEY=
+GAZE_API_KEY=
 # ─── Data & Paths ──────────────────────────────────────────────────────────────
 DATA_DIR=${HOME}/homeai-data
@@ -40,10 +41,14 @@ OPEN_WEBUI_URL=http://localhost:3030
 OLLAMA_PRIMARY_MODEL=llama3.3:70b
 OLLAMA_FAST_MODEL=qwen2.5:7b
+# Medium model kept warm for voice pipeline (override per persona)
+# Used by preload-models.sh keep-warm daemon
+HOMEAI_MEDIUM_MODEL=qwen3.5:35b-a3b
 # ─── P3: Voice ─────────────────────────────────────────────────────────────────
 WYOMING_STT_URL=tcp://localhost:10300
 WYOMING_TTS_URL=tcp://localhost:10301
-ELEVENLABS_API_KEY=  # Create at elevenlabs.io if using elevenlabs TTS engine
+# ELEVENLABS_API_KEY is set above in API Keys section
 # ─── P4: Agent ─────────────────────────────────────────────────────────────────
 OPENCLAW_URL=http://localhost:8080
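The `preload-models.sh` daemon referenced in the comments above is not shown in this diff; its keep-warm idea can be sketched in Python (model names taken from the env above; the empty-prompt preload and `keep_alive` parameter are standard Ollama API behaviour):

```python
import json
import time
import urllib.request

MODELS = ["qwen2.5:7b", "qwen3.5:35b-a3b"]  # fast model + $HOMEAI_MEDIUM_MODEL

def pin(model: str) -> None:
    # An empty prompt makes Ollama load the model without generating anything;
    # keep_alive=-1 keeps it resident in VRAM until Ollama restarts.
    body = json.dumps({"model": model, "prompt": "", "keep_alive": -1}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=120).read()

while True:
    for m in MODELS:
        pin(m)       # re-pins any model Ollama has evicted
    time.sleep(300)  # check every 5 minutes
```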

View File

@@ -26,6 +26,7 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### AI & LLM
 - **Ollama** — local LLM runtime (target models: Llama 3.3 70B, Qwen 2.5 72B)
+- **Model keep-warm daemon** — `preload-models.sh` runs as a loop, checks every 5 min, re-pins evicted models with `keep_alive=-1`. Keeps `qwen2.5:7b` (small/fast) and `$HOMEAI_MEDIUM_MODEL` (default: `qwen3.5:35b-a3b`) always loaded in VRAM. Medium model is configurable via env var for per-persona model assignment.
 - **Open WebUI** — browser-based chat interface, runs as Docker container
 ### Image Generation
@@ -35,7 +36,8 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### Speech
 - **Whisper.cpp** — speech-to-text, optimised for Apple Silicon/Neural Engine
-- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency)
+- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency, local)
+- **ElevenLabs TTS** — cloud voice cloning/synthesis (per-character voice ID, routed via state file)
 - **Chatterbox TTS** — voice cloning engine (Apple Silicon MPS optimised)
 - **Qwen3-TTS** — alternative voice cloning via MLX
 - **openWakeWord** — always-on wake word detection
@@ -49,11 +51,13 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### AI Agent / Orchestration
 - **OpenClaw** — primary AI agent layer; receives voice commands, calls tools, manages personality
 - **n8n** — visual workflow automation (Docker), chains AI actions
-- **mem0** — long-term memory layer for the AI character
+- **Character Memory System** — two-tier JSON-based memories (personal per-character + general shared), injected into LLM system prompt with budget truncation
 ### Character & Personality
-- **Character Manager** (built — see `character-manager.jsx`) — single config UI for personality, prompts, models, Live2D mappings, and notes
-- Character config exports to JSON, consumed by OpenClaw system prompt and pipeline
+- **Character Schema v2** — JSON spec with background, dialogue_style, appearance, skills, gaze_presets (v1 auto-migrated)
+- **HomeAI Dashboard** — unified web app: character editor, chat, memory manager, service dashboard
+- **Character MCP Server** — LLM-assisted character creation via Fandom wiki/Wikipedia lookup (Docker)
+- Character config stored as JSON files in `~/homeai-data/characters/`, consumed by bridge for system prompt construction
 ### Visual Representation
 - **VTube Studio** — Live2D model display on desktop (macOS) and mobile (iOS/Android)
@@ -85,47 +89,79 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ESP32-S3-BOX-3 (room)
 → Wake word detected (openWakeWord, runs locally on device or Mac Mini)
 → Audio streamed to Mac Mini via Wyoming Satellite
-→ Whisper.cpp transcribes speech to text
-→ OpenClaw receives text + context
-→ Ollama LLM generates response (with character persona from system prompt)
-→ mem0 updates long-term memory
+→ Whisper MLX transcribes speech to text
+→ HA conversation agent → OpenClaw HTTP Bridge
+→ Bridge resolves character (satellite_id → character mapping)
+→ Bridge builds system prompt (profile + memories) and writes TTS config to state file
+→ OpenClaw CLI → Ollama LLM generates response
 → Response dispatched:
-  → Kokoro/Chatterbox renders TTS audio
+  → Wyoming TTS reads state file → routes to Kokoro (local) or ElevenLabs (cloud)
   → Audio sent back to ESP32-S3-BOX-3 (spoken response)
   → VTube Studio API triggered (expression + lip sync on desktop/mobile)
   → Home Assistant action called if applicable (lights, music, etc.)
 ```
+### Timeout Strategy
+The HTTP bridge checks Ollama `/api/ps` before each request to determine if the LLM is already loaded:
+
+| Layer | Warm (model loaded) | Cold (model loading) |
+|---|---|---|
+| HA conversation component | 200s | 200s |
+| OpenClaw HTTP bridge | 120s | 180s |
+| OpenClaw agent | 60s | 60s |
+
+The keep-warm daemon ensures models stay loaded, so cold starts should be rare (only after Ollama restarts or VRAM pressure).
 ---
 ## Character System
-The AI assistant has a defined personality managed via the Character Manager tool.
-Key config surfaces:
-- **System prompt** — injected into every Ollama request
-- **Voice clone reference** — `.wav` file path for Chatterbox/Qwen3-TTS
-- **Live2D expression mappings** — idle, speaking, thinking, happy, error states
-- **VTube Studio WebSocket triggers** — JSON map of events to expressions
+The AI assistant has a defined personality managed via the HomeAI Dashboard (character editor + memory manager).
+
+### Character Schema v2
+Each character is a JSON file in `~/homeai-data/characters/` with:
+- **System prompt** — core personality, injected into every LLM request
+- **Profile fields** — background, appearance, dialogue_style, skills array
+- **TTS config** — engine (kokoro/elevenlabs), kokoro_voice, elevenlabs_voice_id, elevenlabs_model, speed
+- **GAZE presets** — array of `{preset, trigger}` for image generation styles
 - **Custom prompt rules** — trigger/response overrides for specific contexts
-- **mem0** — persistent memory that evolves over time
-Character config JSON (exported from Character Manager) is the single source of truth consumed by all pipeline components.
+
+### Memory System
+Two-tier memory stored as JSON in `~/homeai-data/memories/`:
+- **Personal memories** (`personal/{character_id}.json`) — per-character, about user interactions
+- **General memories** (`general.json`) — shared operational knowledge (tool usage, device info, routines)
+
+Memories are injected into the system prompt by the bridge with budget truncation (personal: 4000 chars, general: 3000 chars, newest first).
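For reference, the memory file shape implied by the loader code further down in this diff (a `memories` array of `{content, createdAt}` objects) would look like this; the entries themselves are hypothetical:

```json
{
  "memories": [
    { "content": "Prefers metric units in weather reports", "createdAt": "2026-03-15T09:12:00Z" },
    { "content": "Asks for jazz playlists most evenings", "createdAt": "2026-03-10T21:40:00Z" }
  ]
}
```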
+### TTS Voice Routing
+The bridge writes the active character's TTS config to `~/homeai-data/active-tts-voice.json` before each request. The Wyoming TTS server reads this state file to determine which engine/voice to use:
+- **Kokoro** — local, fast, uses `kokoro_voice` field (e.g., `af_heart`)
+- **ElevenLabs** — cloud, uses `elevenlabs_voice_id` + `elevenlabs_model`, returns PCM 24kHz
+
+This works for both the ESP32/HA pipeline and dashboard chat.
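Per `set_active_tts_voice()` in the bridge diff below, the state file carries exactly these fields (the values here are hypothetical):

```json
{
  "character_id": "aria_default",
  "engine": "elevenlabs",
  "kokoro_voice": "",
  "elevenlabs_voice_id": "EXAMPLE_VOICE_ID",
  "elevenlabs_model": "eleven_multilingual_v2",
  "speed": 1.0
}
```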
 ---
 ## Project Priorities
 1. **Foundation** — Docker stack up (Home Assistant, Open WebUI, Portainer, Uptime Kuma)
 2. **LLM** — Ollama running with target models, Open WebUI connected
 3. **Voice pipeline** — Whisper → Ollama → Kokoro → Wyoming → Home Assistant
 4. **OpenClaw** — installed, onboarded, connected to Ollama and Home Assistant
-5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, LVGL face
-6. **Character system** — system prompt wired up, mem0 integrated, voice cloned
-7. **VTube Studio** — model loaded, WebSocket API bridge written as OpenClaw skill
-8. **ComfyUI** — image generation online, character-consistent model workflows
-9. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
-10. **Polish** — Authelia, Tailscale hardening, mobile companion, iOS widgets
+5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, display faces ✅
+6. **Character system** — schema v2, dashboard editor, memory system, per-character TTS routing ✅
+7. **Animated visual** — PNG/GIF character visual for the web assistant (initial visual layer)
+8. **Android app** — companion app for mobile access to the assistant
+9. **ComfyUI** — image generation online, character-consistent model workflows
+10. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
+11. **Polish** — Authelia, Tailscale hardening, iOS widgets
+
+### Stretch Goals
+- **Live2D / VTube Studio** — full Live2D model with WebSocket API bridge (requires learning Live2D tooling)
 ---
@@ -133,7 +169,11 @@ Character config JSON (exported from Character Manager) is the single source of
 - All Docker compose files: `~/server/docker/`
 - OpenClaw skills: `~/.openclaw/skills/`
-- Character configs: `~/.openclaw/characters/`
+- Character configs: `~/homeai-data/characters/`
+- Character memories: `~/homeai-data/memories/`
+- Conversation history: `~/homeai-data/conversations/`
+- Active TTS state: `~/homeai-data/active-tts-voice.json`
+- Satellite → character map: `~/homeai-data/satellite-map.json`
 - Whisper models: `~/models/whisper/`
 - Ollama models: managed by Ollama at `~/.ollama/models/`
 - ComfyUI models: `~/ComfyUI/models/`
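The satellite → character map listed above, as read by `load_satellite_map()` in the bridge diff below, presumably has this shape (the satellite IDs are hypothetical):

```json
{
  "default": "aria_default",
  "satellites": {
    "esp32-living-room": "aria_default",
    "rpi-kitchen-selbina": "aria_default"
  }
}
```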

TODO.md
View File

@@ -26,7 +26,7 @@
 - [x] Register local GGUF models via Modelfiles (no download): llama3.3:70b, qwen3:32b, codestral:22b, qwen2.5:7b
 - [x] Register additional models: EVA-LLaMA-3.33-70B, Midnight-Miqu-70B, QwQ-32B, Qwen3.5-35B, Qwen3-Coder-30B, Qwen3-VL-30B, GLM-4.6V-Flash, DeepSeek-R1-8B, gemma-3-27b
 - [x] Add qwen3.5:35b-a3b (MoE, Q8_0) — 26.7 tok/s, recommended for voice pipeline
-- [x] Write model preload script + launchd service (keeps voice model in VRAM permanently)
+- [x] Write model keep-warm daemon + launchd service (pins qwen2.5:7b + $HOMEAI_MEDIUM_MODEL in VRAM, checks every 5 min)
 - [x] Deploy Open WebUI via Docker compose (port 3030)
 - [x] Verify Open WebUI connected to Ollama, all models available
 - [x] Run pipeline benchmark (homeai-voice/scripts/benchmark_pipeline.py) — STT/LLM/TTS latency profiled
@@ -82,7 +82,7 @@
 - [x] Verify full voice → agent → HA action flow
 - [x] Add OpenClaw to Uptime Kuma monitors (Manual user action required)
-### P5 · homeai-character *(can start alongside P4)*
+### P5 · homeai-dashboard *(character system + dashboard)*
 - [x] Define and write `schema/character.schema.json` (v1)
 - [x] Write `characters/aria.json` — default character
@@ -99,6 +99,16 @@
 - [x] Build unified HomeAI dashboard — dark-themed frontend showing live service status + links to individual UIs
 - [x] Add character profile management to dashboard — store/switch character configs with attached profile images
 - [x] Add TTS voice preview in character editor — Kokoro preview via OpenClaw bridge with loading state, custom text, stop control
+- [x] Merge homeai-character + homeai-desktop into unified homeai-dashboard (services, chat, characters, editor)
+- [x] Upgrade character schema to v2 — background, dialogue_style, appearance, skills, gaze_presets (auto-migrate v1)
+- [x] Add LLM-assisted character creation via Character MCP server (Fandom/Wikipedia lookup)
+- [x] Add character memory system — personal (per-character) + general (shared) memories with dashboard UI
+- [x] Add conversation history with per-conversation persistence
+- [x] Wire character_id through full pipeline (dashboard → bridge → LLM system prompt)
+- [x] Add TTS text cleaning — strip tags, asterisks, emojis, markdown before synthesis
+- [x] Add per-character TTS voice routing — bridge writes state file, Wyoming server reads it
+- [x] Add ElevenLabs TTS support in Wyoming server — cloud voice synthesis via state file routing
+- [x] Dashboard auto-selects character's TTS engine/voice (Kokoro or ElevenLabs)
 - [ ] Deploy dashboard as Docker container or static site on Mac Mini
 ---
@@ -107,63 +117,86 @@
 ### P6 · homeai-esp32
-- [ ] Install ESPHome: `pip install esphome`
-- [ ] Write `esphome/secrets.yaml` (gitignored)
-- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
-- [ ] Write `s3-box-living-room.yaml` for first unit
-- [ ] Flash first unit via USB
-- [ ] Verify unit appears in HA device list
-- [ ] Assign Wyoming voice pipeline to unit in HA
-- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
-- [ ] Test LVGL face: idle → listening → thinking → speaking → error
-- [ ] Verify OTA firmware update works wirelessly
-- [ ] Flash remaining units (bedroom, kitchen, etc.)
+- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12 venv)
+- [x] Write `esphome/secrets.yaml` (gitignored)
+- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
+- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
+- [x] Write `setup.sh` with flash/ota/logs/validate commands
+- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
+- [x] Flash first unit via USB (living room)
+- [x] Verify unit appears in HA device list (requires HA 2026.x for ESPHome 2025.12+ compat)
+- [x] Assign Wyoming voice pipeline to unit in HA
+- [x] Test full wake → STT → LLM → TTS → audio playback cycle
+- [x] Test display states: idle → listening → thinking → replying → error
+- [x] Verify OTA firmware update works wirelessly (`deploy.sh --device OTA`)
+- [ ] Flash remaining units (bedroom, kitchen)
 - [ ] Document MAC address → room name mapping
+
+### P6b · homeai-rpi (Kitchen Satellite)
+- [x] Set up Wyoming Satellite on Raspberry Pi 5 (SELBINA) with ReSpeaker 2-Mics pHAT
+- [x] Write setup.sh — full Pi provisioning (venv, drivers, systemd, scripts)
+- [x] Write deploy.sh — remote deploy/manage from Mac Mini (push-wrapper, test-logs, etc.)
+- [x] Write satellite_wrapper.py — monkey-patches fixing TTS echo, writer race, streaming timeout
+- [x] Test multi-command voice loop without freezing
 ---
 ## Phase 5 — Visual Layer
 ### P7 · homeai-visual
-- [ ] Install VTube Studio (Mac App Store)
-- [ ] Enable WebSocket API on port 8001
-- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
-- [ ] Load model in VTube Studio
-- [ ] Create hotkeys for all 8 expression states
-- [ ] Write `skills/vtube_studio` SKILL.md + implementation
-- [ ] Run auth flow — click Allow in VTube Studio, save token
-- [ ] Test all 8 expressions via test script
-- [ ] Update `aria.json` with real VTube Studio hotkey IDs
-- [ ] Write `lipsync.py` amplitude-based helper
-- [ ] Integrate lip sync into OpenClaw TTS dispatch
-- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
+#### VTube Studio Expression Bridge
+- [x] Write `vtube-bridge.py` — persistent WebSocket ↔ HTTP bridge daemon (port 8002)
+- [x] Write `vtube-ctl` CLI wrapper + OpenClaw skill (`~/.openclaw/skills/vtube-studio/`)
+- [x] Wire expression triggers into `openclaw-http-bridge.py` (thinking → idle, speaking → idle)
+- [x] Add amplitude-based lip sync to `wyoming_kokoro_server.py` (RMS → MouthOpen parameter)
+- [x] Write `test-expressions.py` — auth flow, expression cycle, lip sync sweep, latency test
+- [x] Write launchd plist + setup.sh for venv creation and service registration
+- [ ] Install VTube Studio from Mac App Store, enable WebSocket API (port 8001)
+- [ ] Source/purchase Live2D model, load in VTube Studio
+- [ ] Create 8 expression hotkeys, record UUIDs
+- [ ] Run `setup.sh` to create venv, install websockets, load launchd service
+- [ ] Run `vtube-ctl auth` — click Allow in VTube Studio
+- [ ] Update `aria.json` with real hotkey UUIDs (replace placeholders)
+- [ ] Run `test-expressions.py --all` — verify expressions + lip sync + latency
 - [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
+
+#### Web Visuals (Dashboard)
+- [ ] Design PNG/GIF character visuals for web assistant (idle, thinking, speaking, etc.)
+- [ ] Integrate animated visuals into homeai-dashboard chat view
+- [ ] Sync visual state to voice pipeline events (listening, processing, responding)
+- [ ] Add expression transitions and idle animations
+
+### P8 · homeai-android
+- [ ] Build Android companion app for mobile assistant access
+- [ ] Integrate with OpenClaw bridge API (chat, TTS, STT)
+- [ ] Add character visual display
+- [ ] Push notification support via ntfy/FCM
 ---
 ## Phase 6 — Image Generation
-### P8 · homeai-images
+### P9 · homeai-images (ComfyUI)
 - [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
 - [ ] Verify MPS is detected at launch
 - [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
-- [ ] Download SDXL base model
-- [ ] Download Flux.1-schnell
-- [ ] Download ControlNet models (canny, depth)
+- [ ] Download SDXL base model + Flux.1-schnell + ControlNet models
 - [ ] Test generation via ComfyUI web UI (port 8188)
-- [ ] Build and export `quick.json`, `portrait.json`, `scene.json`, `upscale.json` workflows
+- [ ] Build and export workflow JSONs (quick, portrait, scene, upscale)
 - [ ] Write `skills/comfyui` SKILL.md + implementation
-- [ ] Test skill: "Generate a portrait of Aria looking happy"
 - [ ] Collect character reference images for LoRA training
+- [ ] Train SDXL LoRA with kohya_ss, verify character consistency
 - [ ] Add ComfyUI to Uptime Kuma monitors
 ---
 ## Phase 7 — Extended Integrations & Polish
+### P10 · Integrations & Polish
 - [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
 - [ ] Write `skills/music` SKILL.md for OpenClaw
 - [ ] Deploy Snapcast server on Mac Mini
@@ -180,10 +213,24 @@
 ---
+## Stretch Goals
+### Live2D / VTube Studio
+- [ ] Learn Live2D modelling toolchain (Live2D Cubism Editor)
+- [ ] Install VTube Studio (Mac App Store), enable WebSocket API on port 8001
+- [ ] Source/commission a Live2D model (nizima.com or booth.pm)
+- [ ] Create hotkeys for expression states
+- [ ] Write `skills/vtube_studio` SKILL.md + implementation
+- [ ] Write `lipsync.py` amplitude-based helper
+- [ ] Integrate lip sync into OpenClaw TTS dispatch
+- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
+---
 ## Open Decisions
 - [ ] Confirm character name (determines wake word training)
-- [ ] Live2D model: purchase off-the-shelf or commission custom?
 - [ ] mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
 - [ ] Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
 - [ ] Authelia user store: local file vs LDAP?

View File

@@ -12,7 +12,7 @@ CONF_TIMEOUT = "timeout"
 DEFAULT_HOST = "10.0.0.101"
 DEFAULT_PORT = 8081  # OpenClaw HTTP Bridge (not 8080 gateway)
 DEFAULT_AGENT = "main"
-DEFAULT_TIMEOUT = 120
+DEFAULT_TIMEOUT = 200  # Must exceed bridge cold timeout (180s)
 # API endpoints
 OPENCLAW_API_PATH = "/api/agent/message"

View File

@@ -77,12 +77,16 @@ class OpenClawAgent(AbstractConversationAgent):
         _LOGGER.debug("Processing message: %s", text)
         try:
-            response_text = await self._call_openclaw(text)
+            response_text = await self._call_openclaw(
+                text,
+                satellite_id=getattr(user_input, "satellite_id", None),
+                device_id=getattr(user_input, "device_id", None),
+            )
             # Create proper IntentResponse for Home Assistant
             intent_response = IntentResponse(language=user_input.language or "en")
             intent_response.async_set_speech(response_text)
             return ConversationResult(
                 response=intent_response,
                 conversation_id=conversation_id,
@@ -96,13 +100,14 @@ class OpenClawAgent(AbstractConversationAgent):
                 conversation_id=conversation_id,
             )

-    async def _call_openclaw(self, message: str) -> str:
+    async def _call_openclaw(self, message: str, satellite_id: str = None, device_id: str = None) -> str:
         """Call OpenClaw API and return the response."""
         url = f"http://{self.host}:{self.port}{OPENCLAW_API_PATH}"
         payload = {
             "message": message,
             "agent": self.agent_name,
+            "satellite_id": satellite_id or device_id,
         }
         session = async_get_clientsession(self.hass)

View File

@@ -35,6 +35,8 @@
     <dict>
         <key>PATH</key>
         <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
+        <key>ELEVENLABS_API_KEY</key>
+        <string>sk_ec10e261c6190307a37aa161a9583504dcf25a0cabe5dbd5</string>
     </dict>
 </dict>
 </plist>

View File

@@ -28,6 +28,8 @@
         <string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
         <key>HASS_TOKEN</key>
         <string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
+        <key>GAZE_API_KEY</key>
+        <string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
     </dict>
     <key>RunAtLoad</key>

View File

@@ -24,33 +24,241 @@ Endpoints:
 import argparse
 import json
+import os
 import subprocess
 import sys
 import asyncio
+import urllib.request
+import threading
 from http.server import HTTPServer, BaseHTTPRequestHandler
+from socketserver import ThreadingMixIn
 from urllib.parse import urlparse
 from pathlib import Path
 import wave
 import io
+import re

 from wyoming.client import AsyncTcpClient
 from wyoming.tts import Synthesize, SynthesizeVoice
 from wyoming.asr import Transcribe, Transcript
 from wyoming.audio import AudioStart, AudioChunk, AudioStop
 from wyoming.info import Info

+# Timeout settings (seconds)
+TIMEOUT_WARM = 120   # Model already loaded in VRAM
+TIMEOUT_COLD = 180   # Model needs loading first (~10-20s load + inference)
+OLLAMA_PS_URL = "http://localhost:11434/api/ps"
+VTUBE_BRIDGE_URL = "http://localhost:8002"

-def load_character_prompt() -> str:
-    """Load the active character system prompt."""
-    character_path = Path.home() / ".openclaw" / "characters" / "aria.json"
+def _vtube_fire_and_forget(path: str, data: dict):
+    """Send a non-blocking POST to the VTube Studio bridge. Failures are silent."""
+    def _post():
+        try:
+            body = json.dumps(data).encode()
+            req = urllib.request.Request(
+                f"{VTUBE_BRIDGE_URL}{path}",
+                data=body,
+                headers={"Content-Type": "application/json"},
+                method="POST",
+            )
+            urllib.request.urlopen(req, timeout=2)
+        except Exception:
+            pass  # bridge may not be running — that's fine
+    threading.Thread(target=_post, daemon=True).start()

+def is_model_warm() -> bool:
+    """Check if the default Ollama model is already loaded in VRAM."""
+    try:
+        req = urllib.request.Request(OLLAMA_PS_URL)
+        with urllib.request.urlopen(req, timeout=2) as resp:
+            data = json.loads(resp.read())
+        return len(data.get("models", [])) > 0
+    except Exception:
+        # If we can't reach Ollama, assume cold (safer longer timeout)
+        return False

+CHARACTERS_DIR = Path("/Users/aodhan/homeai-data/characters")
+SATELLITE_MAP_PATH = Path("/Users/aodhan/homeai-data/satellite-map.json")
+MEMORIES_DIR = Path("/Users/aodhan/homeai-data/memories")
+ACTIVE_TTS_VOICE_PATH = Path("/Users/aodhan/homeai-data/active-tts-voice.json")

+def clean_text_for_tts(text: str) -> str:
+    """Strip content that shouldn't be spoken: tags, asterisks, emojis, markdown."""
+    # Remove HTML/XML tags
+    text = re.sub(r'<[^>]+>', '', text)
+    # Remove content between asterisks (actions/emphasis markup like *sighs*)
+    text = re.sub(r'\*[^*]+\*', '', text)
+    # Remove markdown bold/italic markers that might remain
+    text = re.sub(r'[*_]{1,3}', '', text)
+    # Remove markdown headers
+    text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)
+    # Remove markdown links [text](url) → keep text
+    text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', text)
+    # Remove bare URLs
+    text = re.sub(r'https?://\S+', '', text)
+    # Remove code blocks and inline code
+    text = re.sub(r'```[\s\S]*?```', '', text)
+    text = re.sub(r'`[^`]+`', '', text)
+    # Remove emojis
+    text = re.sub(
+        r'[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF'
+        r'\U0001F1E0-\U0001F1FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FAFF'
+        r'\U00002702-\U000027B0\U0000FE00-\U0000FE0F\U0000200D'
+        r'\U00002600-\U000026FF\U00002300-\U000023FF]+', '', text
+    )
+    # Collapse multiple spaces/newlines
+    text = re.sub(r'\n{2,}', '\n', text)
+    text = re.sub(r'[ \t]{2,}', ' ', text)
+    return text.strip()
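As an illustration of what the cleaner above produces (this call is editorial, not part of the diff):

```python
clean_text_for_tts("*sighs* Check [the docs](https://example.com) 😀")
# -> "Check the docs"
```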
+def load_satellite_map() -> dict:
+    """Load the satellite-to-character mapping."""
+    try:
+        with open(SATELLITE_MAP_PATH) as f:
+            return json.load(f)
+    except Exception:
+        return {"default": "aria_default", "satellites": {}}

+def set_active_tts_voice(character_id: str, tts_config: dict):
+    """Write the active TTS config to a state file for the Wyoming TTS server to read."""
+    try:
+        ACTIVE_TTS_VOICE_PATH.parent.mkdir(parents=True, exist_ok=True)
+        state = {
+            "character_id": character_id,
+            "engine": tts_config.get("engine", "kokoro"),
+            "kokoro_voice": tts_config.get("kokoro_voice", ""),
+            "elevenlabs_voice_id": tts_config.get("elevenlabs_voice_id", ""),
+            "elevenlabs_model": tts_config.get("elevenlabs_model", "eleven_multilingual_v2"),
+            "speed": tts_config.get("speed", 1),
+        }
+        with open(ACTIVE_TTS_VOICE_PATH, "w") as f:
+            json.dump(state, f)
+    except Exception as e:
+        print(f"[OpenClaw Bridge] Warning: could not write active TTS config: {e}")

+def resolve_character_id(satellite_id: str = None) -> str:
+    """Resolve a satellite ID to a character profile ID."""
+    sat_map = load_satellite_map()
+    if satellite_id and satellite_id in sat_map.get("satellites", {}):
+        return sat_map["satellites"][satellite_id]
+    return sat_map.get("default", "aria_default")

+def load_character(character_id: str = None) -> dict:
+    """Load a character profile by ID. Returns the full character data dict."""
+    if not character_id:
+        character_id = resolve_character_id()
+    safe_id = character_id.replace("/", "_")
+    character_path = CHARACTERS_DIR / f"{safe_id}.json"
     if not character_path.exists():
-        return ""
+        return {}
     try:
         with open(character_path) as f:
-            data = json.load(f)
-        return data.get("system_prompt", "")
+            profile = json.load(f)
+        return profile.get("data", {})
     except Exception:
-        return ""
+        return {}

+def load_character_prompt(satellite_id: str = None, character_id: str = None) -> str:
+    """Load the full system prompt for a character, resolved by satellite or explicit ID.
+    Builds a rich prompt from system_prompt + profile fields (background, dialogue_style, etc.)."""
+    if not character_id:
+        character_id = resolve_character_id(satellite_id)
+    char = load_character(character_id)
+    if not char:
+        return ""
+    sections = []
+    # Core system prompt
+    prompt = char.get("system_prompt", "")
+    if prompt:
+        sections.append(prompt)
+    # Character profile fields
+    profile_parts = []
+    if char.get("background"):
+        profile_parts.append(f"## Background\n{char['background']}")
+    if char.get("appearance"):
+        profile_parts.append(f"## Appearance\n{char['appearance']}")
+    if char.get("dialogue_style"):
+        profile_parts.append(f"## Dialogue Style\n{char['dialogue_style']}")
+    if char.get("skills"):
+        skills = char["skills"]
+        if isinstance(skills, list):
+            skills_text = ", ".join(skills[:15])
+        else:
+            skills_text = str(skills)
+        profile_parts.append(f"## Skills & Interests\n{skills_text}")
+    if profile_parts:
+        sections.append("[Character Profile]\n" + "\n\n".join(profile_parts))
+    # Character metadata
+    meta_lines = []
+    if char.get("display_name"):
+        meta_lines.append(f"Your name is: {char['display_name']}")
+    # Support both v1 (gaze_preset string) and v2 (gaze_presets array)
+    gaze_presets = char.get("gaze_presets", [])
+    if gaze_presets and isinstance(gaze_presets, list):
+        for gp in gaze_presets:
+            preset = gp.get("preset", "")
+            trigger = gp.get("trigger", "self-portrait")
+            if preset:
+                meta_lines.append(f"GAZE preset '{preset}' — use for: {trigger}")
+    elif char.get("gaze_preset"):
+        meta_lines.append(f"Your gaze_preset for self-portraits is: {char['gaze_preset']}")
+    if meta_lines:
+        sections.append("[Character Metadata]\n" + "\n".join(meta_lines))
+    # Memories (personal + general)
+    personal, general = load_memories(character_id)
+    if personal:
+        sections.append("[Personal Memories]\n" + "\n".join(f"- {m}" for m in personal))
+    if general:
+        sections.append("[General Knowledge]\n" + "\n".join(f"- {m}" for m in general))
+    return "\n\n".join(sections)

+def load_memories(character_id: str) -> tuple[list[str], list[str]]:
+    """Load personal (per-character) and general memories.
+    Returns (personal_contents, general_contents) truncated to fit context budget."""
+    PERSONAL_BUDGET = 4000  # max chars for personal memories in prompt
+    GENERAL_BUDGET = 3000   # max chars for general memories in prompt
+
+    def _read_memories(path: Path, budget: int) -> list[str]:
+        try:
+            with open(path) as f:
+                data = json.load(f)
+        except Exception:
+            return []
+        memories = data.get("memories", [])
+        # Sort newest first
+        memories.sort(key=lambda m: m.get("createdAt", ""), reverse=True)
+        result = []
+        used = 0
+        for m in memories:
+            content = m.get("content", "").strip()
+            if not content:
+                continue
+            if used + len(content) > budget:
+                break
+            result.append(content)
+            used += len(content)
+        return result
+
+    safe_id = character_id.replace("/", "_")
+    personal = _read_memories(MEMORIES_DIR / "personal" / f"{safe_id}.json", PERSONAL_BUDGET)
+    general = _read_memories(MEMORIES_DIR / "general.json", GENERAL_BUDGET)
+    return personal, general
 class OpenClawBridgeHandler(BaseHTTPRequestHandler):
     """HTTP request handler for OpenClaw bridge."""

@@ -93,37 +301,78 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
         self._send_json_response(404, {"error": "Not found"})

     def _handle_tts_request(self):
-        """Handle TTS request and return wav audio."""
+        """Handle TTS request and return audio. Routes to Kokoro or ElevenLabs based on engine."""
         content_length = int(self.headers.get("Content-Length", 0))
         if content_length == 0:
             self._send_json_response(400, {"error": "Empty body"})
             return
         try:
             body = self.rfile.read(content_length).decode()
             data = json.loads(body)
         except json.JSONDecodeError:
             self._send_json_response(400, {"error": "Invalid JSON"})
             return

         text = data.get("text", "Hello, this is a test.")
+        text = clean_text_for_tts(text)
         voice = data.get("voice", "af_heart")
+        engine = data.get("engine", "kokoro")
         try:
-            # Run the async Wyoming client
-            audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
-            # Send WAV response
+            # Signal avatar: speaking
+            _vtube_fire_and_forget("/expression", {"event": "speaking"})
+            if engine == "elevenlabs":
+                audio_bytes, content_type = self._synthesize_elevenlabs(text, voice, data.get("model"))
+            else:
+                # Default: local Kokoro via Wyoming
+                audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
+                content_type = "audio/wav"
+            # Signal avatar: idle
+            _vtube_fire_and_forget("/expression", {"event": "idle"})
             self.send_response(200)
-            self.send_header("Content-Type", "audio/wav")
+            self.send_header("Content-Type", content_type)
+            # Allow CORS for local testing from Vite
             self.send_header("Access-Control-Allow-Origin", "*")
             self.end_headers()
             self.wfile.write(audio_bytes)
         except Exception as e:
+            _vtube_fire_and_forget("/expression", {"event": "error"})
             self._send_json_response(500, {"error": str(e)})

+    def _synthesize_elevenlabs(self, text: str, voice_id: str, model: str = None) -> tuple[bytes, str]:
+        """Call ElevenLabs TTS API and return (audio_bytes, content_type)."""
+        api_key = os.environ.get("ELEVENLABS_API_KEY", "")
+        if not api_key:
+            raise RuntimeError("ELEVENLABS_API_KEY not set in environment")
+        if not voice_id:
+            raise RuntimeError("No ElevenLabs voice ID provided")
+        model = model or "eleven_multilingual_v2"
+        url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
+        payload = json.dumps({
+            "text": text,
+            "model_id": model,
+            "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
+        }).encode()
+        req = urllib.request.Request(
+            url,
+            data=payload,
+            headers={
+                "Content-Type": "application/json",
+                "xi-api-key": api_key,
+                "Accept": "audio/mpeg",
+            },
+            method="POST",
+        )
+        with urllib.request.urlopen(req, timeout=30) as resp:
+            audio_bytes = resp.read()
+        return audio_bytes, "audio/mpeg"

     def do_OPTIONS(self):
         """Handle CORS preflight requests."""
         self.send_response(204)

@@ -255,6 +504,43 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
         print(f"[OpenClaw Bridge] Wake word detected: {wake_word_data.get('wake_word', 'unknown')}")
         self._send_json_response(200, {"status": "ok", "message": "Wake word received"})

+    @staticmethod
+    def _call_openclaw(message: str, agent: str, timeout: int) -> str:
+        """Call OpenClaw CLI and return stdout."""
+        result = subprocess.run(
+            ["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
+            capture_output=True,
+            text=True,
+            timeout=timeout,
+            check=True,
+        )
+        return result.stdout.strip()

+    @staticmethod
+    def _needs_followup(response: str) -> bool:
+        """Detect if the model promised to act but didn't actually do it.
+        Returns True if the response looks like a 'will do' without a result."""
+        if not response:
+            return False
+        resp_lower = response.lower()
+        # If the response contains a URL or JSON-like output, it probably completed
+        if "http://" in response or "https://" in response or '"status"' in response:
+            return False
+        # If it contains a tool result indicator (ha-ctl output, gaze-ctl output)
+        if any(kw in resp_lower for kw in ["image_url", "seed", "entity_id", "state:", "turned on", "turned off"]):
+            return False
+        # Detect promise-like language without substance
+        promise_phrases = [
+            "let me", "i'll ", "i will ", "sure thing", "sure,", "right away",
+            "generating", "one moment", "working on", "hang on", "just a moment",
+            "on it", "let me generate", "let me create",
+        ]
+        has_promise = any(phrase in resp_lower for phrase in promise_phrases)
+        # Short responses with promise language are likely incomplete
+        if has_promise and len(response) < 200:
+            return True
+        return False
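Illustrative behaviour of the heuristic above (editorial examples, not part of the diff):

```python
_needs_followup("Sure thing, let me generate that for you!")  # True (promise, no result)
_needs_followup("Turned on the living room lights.")          # False (result keyword)
_needs_followup("Here you go: https://example.com/img.png")   # False (contains a URL)
```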
     def _handle_agent_request(self):
         """Handle agent message request."""
         content_length = int(self.headers.get("Content-Length", 0))

@@ -271,29 +557,63 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
         message = data.get("message")
         agent = data.get("agent", "main")
+        satellite_id = data.get("satellite_id")
+        explicit_character_id = data.get("character_id")
         if not message:
             self._send_json_response(400, {"error": "Message is required"})
             return

-        # Inject system prompt
-        system_prompt = load_character_prompt()
+        # Resolve character: explicit ID > satellite mapping > default
+        if explicit_character_id:
+            character_id = explicit_character_id
+        else:
+            character_id = resolve_character_id(satellite_id)
+        system_prompt = load_character_prompt(character_id=character_id)
+        # Set the active TTS config for the Wyoming server to pick up
+        char = load_character(character_id)
+        tts_config = char.get("tts", {})
+        if tts_config:
+            set_active_tts_voice(character_id, tts_config)
+            engine = tts_config.get("engine", "kokoro")
+            voice_label = tts_config.get("kokoro_voice", "") if engine == "kokoro" else tts_config.get("elevenlabs_voice_id", "")
+            print(f"[OpenClaw Bridge] Active TTS: {engine} / {voice_label}")
+        if satellite_id:
+            print(f"[OpenClaw Bridge] Satellite: {satellite_id} → character: {character_id}")
+        elif explicit_character_id:
+            print(f"[OpenClaw Bridge] Character: {character_id}")
         if system_prompt:
             message = f"System Context: {system_prompt}\n\nUser Request: {message}"

+        # Check if model is warm to set appropriate timeout
+        warm = is_model_warm()
+        timeout = TIMEOUT_WARM if warm else TIMEOUT_COLD
+        print(f"[OpenClaw Bridge] Model {'warm' if warm else 'cold'}, timeout={timeout}s")
+        # Signal avatar: thinking
+        _vtube_fire_and_forget("/expression", {"event": "thinking"})
         # Call OpenClaw CLI (use full path for launchd compatibility)
         try:
-            result = subprocess.run(
-                ["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
-                capture_output=True,
-                text=True,
-                timeout=120,
-                check=True
-            )
-            response_text = result.stdout.strip()
+            response_text = self._call_openclaw(message, agent, timeout)
+            # Re-prompt if the model promised to act but didn't call a tool.
+            # Detect "I'll do X" / "Let me X" responses that lack any result.
+            if self._needs_followup(response_text):
+                print("[OpenClaw Bridge] Response looks like a promise without action, re-prompting")
+                followup = (
+                    "You just said you would do something but didn't actually call the exec tool. "
+                    "Do NOT explain what you will do — call the tool NOW using exec and return the result."
+                )
+                response_text = self._call_openclaw(followup, agent, timeout)
+            # Signal avatar: idle (TTS handler will override to 'speaking' if voice is used)
+            _vtube_fire_and_forget("/expression", {"event": "idle"})
             self._send_json_response(200, {"response": response_text})
         except subprocess.TimeoutExpired:
-            self._send_json_response(504, {"error": "OpenClaw command timed out"})
+            self._send_json_response(504, {"error": f"OpenClaw command timed out after {timeout}s (model was {'warm' if warm else 'cold'})"})
         except subprocess.CalledProcessError as e:
             error_msg = e.stderr.strip() if e.stderr else "OpenClaw command failed"
             self._send_json_response(500, {"error": error_msg})

@@ -316,6 +636,10 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
         self._send_json_response(404, {"error": "Not found"})

+class ThreadingHTTPServer(ThreadingMixIn, HTTPServer):
+    daemon_threads = True

 def main():
     """Run the HTTP bridge server."""
     parser = argparse.ArgumentParser(description="OpenClaw HTTP Bridge")

@@ -332,8 +656,8 @@ def main():
     )
     args = parser.parse_args()

-    HTTPServer.allow_reuse_address = True
-    server = HTTPServer((args.host, args.port), OpenClawBridgeHandler)
+    ThreadingHTTPServer.allow_reuse_address = True
+    server = ThreadingHTTPServer((args.host, args.port), OpenClawBridgeHandler)
     print(f"OpenClaw HTTP Bridge running on http://{args.host}:{args.port}")
     print(f"Endpoint: POST http://{args.host}:{args.port}/api/agent/message")
     print("Press Ctrl+C to stop")

View File

@@ -0,0 +1,38 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.character-dashboard</string>
<key>ProgramArguments</key>
<array>
<string>/opt/homebrew/bin/npx</string>
<string>vite</string>
<string>--host</string>
<string>--port</string>
<string>5173</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/aodhan/gitea/homeai/homeai-character</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
<key>HOME</key>
<string>/Users/aodhan</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-character-dashboard.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-character-dashboard-error.log</string>
</dict>
</plist>

homeai-dashboard/.gitignore
View File

@@ -0,0 +1,2 @@
node_modules/
dist/

View File

@@ -0,0 +1,49 @@
{
"schema_version": 1,
"name": "aria",
"display_name": "Aria",
"description": "Default HomeAI assistant persona",
"system_prompt": "You are Aria, a warm, curious, and helpful AI assistant living in the home. You speak naturally and conversationally — never robotic. You are knowledgeable but never condescending. You remember the people you live with and build on those memories over time. Keep responses concise when controlling smart home devices; be more expressive in casual conversation. Never break character.",
"model_overrides": {
"primary": "llama3.3:70b",
"fast": "qwen2.5:7b"
},
"tts": {
"engine": "chatterbox",
"voice_ref_path": "~/voices/aria-raw.wav",
"kokoro_voice": "af_heart",
"speed": 1.0
},
"live2d_expressions": {
"idle": "expr_idle",
"listening": "expr_listening",
"thinking": "expr_thinking",
"speaking": "expr_speaking",
"happy": "expr_happy",
"sad": "expr_sad",
"surprised": "expr_surprised",
"error": "expr_error"
},
"vtube_ws_triggers": {
"thinking": {
"type": "hotkey",
"id": "expr_thinking"
},
"speaking": {
"type": "hotkey",
"id": "expr_speaking"
},
"idle": {
"type": "hotkey",
"id": "expr_idle"
}
},
"custom_rules": [
{
"trigger": "good morning",
"response": "Good morning! How did you sleep?",
"condition": "time_of_day == morning"
}
],
"notes": "Default persona. Voice clone to be added once reference audio recorded."
}

View File

@@ -0,0 +1,15 @@
<!doctype html>
<html lang="en" class="dark">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/icon.svg" />
<link rel="manifest" href="/manifest.json" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="theme-color" content="#030712" />
<title>HomeAI Dashboard</title>
</head>
<body class="bg-gray-950 text-gray-100">
<div id="root"></div>
<script type="module" src="/src/main.jsx"></script>
</body>
</html>

View File

@@ -0,0 +1,43 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.dashboard</string>
<key>ProgramArguments</key>
<array>
<string>/opt/homebrew/bin/npx</string>
<string>vite</string>
<string>--host</string>
<string>--port</string>
<string>5173</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/aodhan/gitea/homeai/homeai-dashboard</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
<key>HOME</key>
<string>/Users/aodhan</string>
<key>GAZE_API_KEY</key>
<string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-dashboard.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-dashboard-error.log</string>
</dict>
</plist>

homeai-dashboard/package-lock.json (generated): diff suppressed because it is too large

View File

@@ -0,0 +1,26 @@
{
"name": "homeai-dashboard",
"private": true,
"version": "0.1.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"dependencies": {
"@tailwindcss/vite": "^4.2.1",
"ajv": "^8.18.0",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"react-router-dom": "^7.13.1",
"tailwindcss": "^4.2.1"
},
"devDependencies": {
"@vitejs/plugin-react": "^5.1.1",
"vite": "^8.0.0-beta.13"
},
"overrides": {
"vite": "^8.0.0-beta.13"
}
}

View File

@@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64">
<rect width="64" height="64" rx="14" fill="#030712"/>
<circle cx="32" cy="28" r="12" fill="none" stroke="#818cf8" stroke-width="2.5"/>
<path d="M26 26c0-3.3 2.7-6 6-6s6 2.7 6 6" fill="none" stroke="#818cf8" stroke-width="2" stroke-linecap="round"/>
<rect x="30" y="40" width="4" height="8" rx="2" fill="#818cf8"/>
<path d="M24 52h16" stroke="#818cf8" stroke-width="2.5" stroke-linecap="round"/>
<circle cx="29" cy="27" r="1.5" fill="#34d399"/>
<circle cx="35" cy="27" r="1.5" fill="#34d399"/>
</svg>


View File

@@ -0,0 +1,16 @@
{
"name": "HomeAI Dashboard",
"short_name": "HomeAI",
"description": "HomeAI dashboard — services, chat, and character management",
"start_url": "/",
"display": "standalone",
"background_color": "#030712",
"theme_color": "#030712",
"icons": [
{
"src": "/icon.svg",
"sizes": "any",
"type": "image/svg+xml"
}
]
}

View File

@@ -0,0 +1,78 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "HomeAI Character Config",
"version": "2",
"type": "object",
"required": ["schema_version", "name", "system_prompt", "tts"],
"properties": {
"schema_version": { "type": "integer", "enum": [1, 2] },
"name": { "type": "string" },
"display_name": { "type": "string" },
"description": { "type": "string" },
"background": { "type": "string", "description": "Backstory, lore, or general prompt enrichment" },
"dialogue_style": { "type": "string", "description": "How the persona speaks or reacts, with example lines" },
"appearance": { "type": "string", "description": "Physical description, also used for image prompting" },
"skills": {
"type": "array",
"description": "Topics the persona specialises in or enjoys talking about",
"items": { "type": "string" }
},
"system_prompt": { "type": "string" },
"model_overrides": {
"type": "object",
"properties": {
"primary": { "type": "string" },
"fast": { "type": "string" }
}
},
"tts": {
"type": "object",
"required": ["engine"],
"properties": {
"engine": {
"type": "string",
"enum": ["kokoro", "chatterbox", "qwen3", "elevenlabs"]
},
"voice_ref_path": { "type": "string" },
"kokoro_voice": { "type": "string" },
"elevenlabs_voice_id": { "type": "string" },
"elevenlabs_voice_name": { "type": "string" },
"elevenlabs_model": { "type": "string", "default": "eleven_monolingual_v1" },
"speed": { "type": "number", "default": 1.0 }
}
},
"gaze_presets": {
"type": "array",
"description": "GAZE image generation presets with trigger conditions",
"items": {
"type": "object",
"required": ["preset"],
"properties": {
"preset": { "type": "string" },
"trigger": { "type": "string", "default": "self-portrait" }
}
}
},
"custom_rules": {
"type": "array",
"description": "Trigger/response overrides for specific contexts",
"items": {
"type": "object",
"properties": {
"trigger": { "type": "string" },
"response": { "type": "string" },
"condition": { "type": "string" }
}
}
},
"notes": { "type": "string" }
},
"additionalProperties": true
}
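For illustration, a minimal character file that validates against this v2 schema might look like the following; all values are hypothetical. Note that the bridge's `load_character()` reads profiles from a top-level `data` wrapper added by the dashboard, so on disk this object sits inside `{"data": {...}}`.

```json
{
  "schema_version": 2,
  "name": "aria",
  "display_name": "Aria",
  "system_prompt": "You are Aria, a warm, helpful home assistant.",
  "background": "Lives on the household Mac Mini and knows its routines.",
  "dialogue_style": "Casual and concise. Example: 'Lights are on. Anything else?'",
  "appearance": "Short dark hair, indigo hoodie.",
  "skills": ["smart home control", "music", "cooking timers"],
  "tts": {
    "engine": "elevenlabs",
    "elevenlabs_voice_id": "EXAMPLE_VOICE_ID",
    "elevenlabs_model": "eleven_multilingual_v2",
    "speed": 1.0
  },
  "gaze_presets": [
    { "preset": "aria-portrait", "trigger": "self-portrait" }
  ]
}
```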

View File

@@ -0,0 +1,136 @@
import { BrowserRouter, Routes, Route, NavLink } from 'react-router-dom';
import Dashboard from './pages/Dashboard';
import Chat from './pages/Chat';
import Characters from './pages/Characters';
import Editor from './pages/Editor';
import Memories from './pages/Memories';
function NavItem({ to, children, icon }) {
return (
<NavLink
to={to}
className={({ isActive }) =>
`flex items-center gap-3 px-4 py-2.5 rounded-lg text-sm font-medium transition-colors ${
isActive
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200 hover:bg-gray-800/50'
}`
}
>
{icon}
<span>{children}</span>
</NavLink>
);
}
function Layout({ children }) {
return (
<div className="h-screen bg-gray-950 flex overflow-hidden">
{/* Sidebar */}
<aside className="w-64 bg-gray-900 border-r border-gray-800 flex flex-col shrink-0">
{/* Logo */}
<div className="px-6 py-5 border-b border-gray-800">
<div className="flex items-center gap-3">
<div className="w-9 h-9 rounded-lg bg-gradient-to-br from-indigo-500 to-purple-600 flex items-center justify-center">
<svg className="w-5 h-5 text-white" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M2.25 12l8.954-8.955c.44-.439 1.152-.439 1.591 0L21.75 12M4.5 9.75v10.125c0 .621.504 1.125 1.125 1.125H9.75v-4.875c0-.621.504-1.125 1.125-1.125h2.25c.621 0 1.125.504 1.125 1.125V21h4.125c.621 0 1.125-.504 1.125-1.125V9.75M8.25 21h8.25" />
</svg>
</div>
<div>
<h1 className="text-lg font-bold text-white tracking-tight">HomeAI</h1>
<p className="text-xs text-gray-500">LINDBLUM</p>
</div>
</div>
</div>
{/* Nav */}
<nav className="flex-1 px-3 py-4 space-y-1">
<NavItem
to="/"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3.75 6A2.25 2.25 0 016 3.75h2.25A2.25 2.25 0 0110.5 6v2.25a2.25 2.25 0 01-2.25 2.25H6a2.25 2.25 0 01-2.25-2.25V6zM3.75 15.75A2.25 2.25 0 016 13.5h2.25a2.25 2.25 0 012.25 2.25V18a2.25 2.25 0 01-2.25 2.25H6A2.25 2.25 0 013.75 18v-2.25zM13.5 6a2.25 2.25 0 012.25-2.25H18A2.25 2.25 0 0120.25 6v2.25A2.25 2.25 0 0118 10.5h-2.25a2.25 2.25 0 01-2.25-2.25V6zM13.5 15.75a2.25 2.25 0 012.25-2.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-2.25A2.25 2.25 0 0113.5 18v-2.25z" />
</svg>
}
>
Dashboard
</NavItem>
<NavItem
to="/chat"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M8.625 12a.375.375 0 11-.75 0 .375.375 0 01.75 0zm0 0H8.25m4.125 0a.375.375 0 11-.75 0 .375.375 0 01.75 0zm0 0H12m4.125 0a.375.375 0 11-.75 0 .375.375 0 01.75 0zm0 0h-.375M21 12c0 4.556-4.03 8.25-9 8.25a9.764 9.764 0 01-2.555-.337A5.972 5.972 0 015.41 20.97a5.969 5.969 0 01-.474-.065 4.48 4.48 0 00.978-2.025c.09-.457-.133-.901-.467-1.226C3.93 16.178 3 14.189 3 12c0-4.556 4.03-8.25 9-8.25s9 3.694 9 8.25z" />
</svg>
}
>
Chat
</NavItem>
<NavItem
to="/characters"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.75 6a3.75 3.75 0 11-7.5 0 3.75 3.75 0 017.5 0zM4.501 20.118a7.5 7.5 0 0114.998 0A17.933 17.933 0 0112 21.75c-2.676 0-5.216-.584-7.499-1.632z" />
</svg>
}
>
Characters
</NavItem>
<NavItem
to="/memories"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
}
>
Memories
</NavItem>
<NavItem
to="/editor"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.324.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.24-.438.613-.431.992a6.759 6.759 0 010 .255c-.007.378.138.75.43.99l1.005.828c.424.35.534.954.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.57 6.57 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.28c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.02-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.992a6.932 6.932 0 010-.255c.007-.378-.138-.75-.43-.99l-1.004-.828a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.087.22-.128.332-.183.582-.495.644-.869l.214-1.281z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
}
>
Editor
</NavItem>
</nav>
{/* Footer */}
<div className="px-6 py-4 border-t border-gray-800">
<p className="text-xs text-gray-600">HomeAI v0.1.0</p>
<p className="text-xs text-gray-700">Mac Mini M4 Pro</p>
</div>
</aside>
{/* Main content */}
<main className="flex-1 overflow-hidden flex flex-col">
{children}
</main>
</div>
);
}
function App() {
return (
<BrowserRouter>
<Layout>
<Routes>
<Route path="/" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Dashboard /></div></div>} />
<Route path="/chat" element={<Chat />} />
<Route path="/characters" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Characters /></div></div>} />
<Route path="/memories" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Memories /></div></div>} />
<Route path="/editor" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Editor /></div></div>} />
</Routes>
</Layout>
</BrowserRouter>
);
}
export default App;

View File

@@ -0,0 +1,41 @@
import { useEffect, useRef } from 'react'
import MessageBubble from './MessageBubble'
import ThinkingIndicator from './ThinkingIndicator'
export default function ChatPanel({ messages, isLoading, onReplay, character }) {
const bottomRef = useRef(null)
const name = character?.name || 'AI'
const image = character?.image || null
useEffect(() => {
bottomRef.current?.scrollIntoView({ behavior: 'smooth' })
}, [messages, isLoading])
if (messages.length === 0 && !isLoading) {
return (
<div className="flex-1 flex items-center justify-center">
<div className="text-center">
{image ? (
<img src={image} alt={name} className="w-20 h-20 rounded-full object-cover mx-auto mb-4 ring-2 ring-indigo-500/30" />
) : (
<div className="w-20 h-20 rounded-full bg-indigo-600/20 flex items-center justify-center mx-auto mb-4">
<span className="text-indigo-400 text-2xl">{name[0]}</span>
</div>
)}
<h2 className="text-xl font-medium text-gray-200 mb-2">Hi, I'm {name}</h2>
<p className="text-gray-500 text-sm">Type a message or press the mic to talk</p>
</div>
</div>
)
}
return (
<div className="flex-1 overflow-y-auto py-4">
{messages.map((msg) => (
<MessageBubble key={msg.id} message={msg} onReplay={onReplay} character={character} />
))}
{isLoading && <ThinkingIndicator character={character} />}
<div ref={bottomRef} />
</div>
)
}

View File

@@ -0,0 +1,70 @@
function timeAgo(dateStr) {
if (!dateStr) return ''
const diff = Date.now() - new Date(dateStr).getTime()
const mins = Math.floor(diff / 60000)
if (mins < 1) return 'just now'
if (mins < 60) return `${mins}m ago`
const hours = Math.floor(mins / 60)
if (hours < 24) return `${hours}h ago`
const days = Math.floor(hours / 24)
return `${days}d ago`
}
export default function ConversationList({ conversations, activeId, onCreate, onSelect, onDelete }) {
return (
<div className="w-72 border-r border-gray-800 flex flex-col bg-gray-950 shrink-0">
{/* New chat button */}
<div className="p-3 border-b border-gray-800">
<button
onClick={onCreate}
className="w-full flex items-center justify-center gap-2 px-3 py-2 bg-indigo-600 hover:bg-indigo-500 text-white text-sm rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New chat
</button>
</div>
{/* Conversation list */}
<div className="flex-1 overflow-y-auto">
{conversations.length === 0 ? (
<p className="text-xs text-gray-600 text-center py-6">No conversations yet</p>
) : (
conversations.map(conv => (
<div
key={conv.id}
onClick={() => onSelect(conv.id)}
className={`group flex items-start gap-2 px-3 py-2.5 cursor-pointer border-b border-gray-800/50 transition-colors ${
conv.id === activeId
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:bg-gray-800/50 hover:text-gray-200'
}`}
>
<div className="flex-1 min-w-0">
<p className="text-sm truncate">
{conv.title || 'New conversation'}
</p>
<div className="flex items-center gap-2 mt-0.5">
{conv.characterName && (
<span className="text-xs text-indigo-400/70">{conv.characterName}</span>
)}
<span className="text-xs text-gray-600">{timeAgo(conv.updatedAt)}</span>
</div>
</div>
<button
onClick={(e) => { e.stopPropagation(); onDelete(conv.id) }}
className="opacity-0 group-hover:opacity-100 p-1 text-gray-500 hover:text-red-400 transition-all shrink-0 mt-0.5"
title="Delete"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
))
)}
</div>
</div>
)
}

View File

@@ -0,0 +1,53 @@
import { useState, useRef } from 'react'
import VoiceButton from './VoiceButton'
export default function InputBar({ onSend, onVoiceToggle, isLoading, isRecording, isTranscribing }) {
const [text, setText] = useState('')
const inputRef = useRef(null)
const handleSubmit = (e) => {
e.preventDefault()
if (!text.trim() || isLoading) return
onSend(text)
setText('')
}
const handleKeyDown = (e) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
handleSubmit(e)
}
}
return (
<form onSubmit={handleSubmit} className="border-t border-gray-800 bg-gray-950 px-4 py-3 shrink-0">
<div className="flex items-end gap-2 max-w-3xl mx-auto">
<VoiceButton
isRecording={isRecording}
isTranscribing={isTranscribing}
onToggle={onVoiceToggle}
disabled={isLoading}
/>
<textarea
ref={inputRef}
value={text}
onChange={(e) => setText(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Type a message..."
rows={1}
className="flex-1 bg-gray-800 text-gray-100 rounded-xl px-4 py-2.5 text-sm resize-none placeholder-gray-500 focus:outline-none focus:ring-1 focus:ring-indigo-500 min-h-[42px] max-h-32"
disabled={isLoading}
/>
<button
type="submit"
disabled={!text.trim() || isLoading}
className="w-10 h-10 rounded-full bg-indigo-600 text-white flex items-center justify-center shrink-0 hover:bg-indigo-500 disabled:opacity-40 disabled:hover:bg-indigo-600 transition-colors"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 12L3.269 3.126A59.768 59.768 0 0121.485 12 59.77 59.77 0 013.27 20.876L5.999 12zm0 0h7.5" />
</svg>
</button>
</div>
</form>
)
}

View File

@@ -0,0 +1,125 @@
import { useState } from 'react'
function Avatar({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
if (image) {
return <img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 mt-0.5 ring-1 ring-gray-700" />
}
return (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0 mt-0.5">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)
}
function ImageOverlay({ src, onClose }) {
return (
<div
className="fixed inset-0 z-50 bg-black/80 flex items-center justify-center cursor-zoom-out"
onClick={onClose}
>
<img
src={src}
alt="Full size"
className="max-w-[90vw] max-h-[90vh] object-contain rounded-lg shadow-2xl"
onClick={(e) => e.stopPropagation()}
/>
<button
onClick={onClose}
className="absolute top-4 right-4 text-white/70 hover:text-white transition-colors p-2"
>
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
)
}
const IMAGE_URL_RE = /(https?:\/\/[^\s]+\.(?:png|jpg|jpeg|gif|webp))/gi
function RichContent({ text }) {
const [overlayImage, setOverlayImage] = useState(null)
const parts = []
let lastIndex = 0
let match
IMAGE_URL_RE.lastIndex = 0
while ((match = IMAGE_URL_RE.exec(text)) !== null) {
if (match.index > lastIndex) {
parts.push({ type: 'text', value: text.slice(lastIndex, match.index) })
}
parts.push({ type: 'image', value: match[1] })
lastIndex = IMAGE_URL_RE.lastIndex
}
if (lastIndex < text.length) {
parts.push({ type: 'text', value: text.slice(lastIndex) })
}
if (parts.length === 1 && parts[0].type === 'text') {
return <>{text}</>
}
return (
<>
{parts.map((part, i) =>
part.type === 'image' ? (
<button
key={i}
onClick={() => setOverlayImage(part.value)}
className="block my-2 cursor-zoom-in"
>
<img
src={part.value}
alt="Generated image"
className="rounded-xl max-w-full max-h-80 object-contain"
loading="lazy"
/>
</button>
) : (
<span key={i}>{part.value}</span>
)
)}
{overlayImage && <ImageOverlay src={overlayImage} onClose={() => setOverlayImage(null)} />}
</>
)
}
export default function MessageBubble({ message, onReplay, character }) {
const isUser = message.role === 'user'
return (
<div className={`flex ${isUser ? 'justify-end' : 'justify-start'} px-4 py-1.5`}>
<div className={`flex items-start gap-3 max-w-[80%] ${isUser ? 'flex-row-reverse' : ''}`}>
{!isUser && <Avatar character={character} />}
<div>
<div
className={`rounded-2xl px-4 py-2.5 text-sm leading-relaxed whitespace-pre-wrap ${
isUser
? 'bg-indigo-600 text-white'
: message.isError
? 'bg-red-900/40 text-red-200 border border-red-800/50'
: 'bg-gray-800 text-gray-100'
}`}
>
{isUser ? message.content : <RichContent text={message.content} />}
</div>
{!isUser && !message.isError && onReplay && (
<button
onClick={() => onReplay(message.content)}
className="mt-1 ml-1 text-gray-500 hover:text-indigo-400 transition-colors"
title="Replay audio"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.536 8.464a5 5 0 010 7.072M17.95 6.05a8 8 0 010 11.9M6.5 9H4a1 1 0 00-1 1v4a1 1 0 001 1h2.5l4 4V5l-4 4z" />
</svg>
</button>
)}
</div>
</div>
</div>
)
}

View File

@@ -0,0 +1,106 @@
import { VOICES, TTS_ENGINES } from '../lib/constants'
export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate }) {
if (!isOpen) return null
const isKokoro = !settings.ttsEngine || settings.ttsEngine === 'kokoro'
return (
<>
<div className="fixed inset-0 bg-black/50 z-40" onClick={onClose} />
<div className="fixed right-0 top-0 bottom-0 w-80 bg-gray-900 border-l border-gray-800 z-50 flex flex-col">
<div className="flex items-center justify-between px-4 py-3 border-b border-gray-800">
<h2 className="text-sm font-medium text-gray-200">Settings</h2>
<button onClick={onClose} className="text-gray-500 hover:text-gray-300">
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
<div className="flex-1 overflow-y-auto p-4 space-y-5">
{/* TTS Engine */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">TTS Engine</label>
<select
value={settings.ttsEngine || 'kokoro'}
onChange={(e) => onUpdate('ttsEngine', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{TTS_ENGINES.map((e) => (
<option key={e.id} value={e.id}>{e.label}</option>
))}
</select>
</div>
{/* Voice */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Voice</label>
{isKokoro ? (
<select
value={settings.voice}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{VOICES.map((v) => (
<option key={v.id} value={v.id}>{v.label}</option>
))}
</select>
) : (
<div>
<input
type="text"
value={settings.voice || ''}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
placeholder={settings.ttsEngine === 'elevenlabs' ? 'ElevenLabs voice ID' : 'Voice identifier'}
readOnly
/>
<p className="text-xs text-gray-500 mt-1">
Set via active character profile
</p>
</div>
)}
</div>
{/* Auto TTS */}
<div className="flex items-center justify-between">
<div>
<div className="text-sm text-gray-200">Auto-speak responses</div>
<div className="text-xs text-gray-500">Speak assistant replies aloud</div>
</div>
<button
onClick={() => onUpdate('autoTts', !settings.autoTts)}
className={`relative w-10 h-6 rounded-full transition-colors ${
settings.autoTts ? 'bg-indigo-600' : 'bg-gray-700'
}`}
>
<span
className={`absolute top-0.5 left-0.5 w-5 h-5 rounded-full bg-white transition-transform ${
settings.autoTts ? 'translate-x-4' : ''
}`}
/>
</button>
</div>
{/* STT Mode */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Speech recognition</label>
<select
value={settings.sttMode}
onChange={(e) => onUpdate('sttMode', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="bridge">Wyoming STT (local)</option>
<option value="webspeech">Web Speech API (browser)</option>
</select>
<p className="text-xs text-gray-500 mt-1">
{settings.sttMode === 'bridge'
? 'Uses Whisper via the local bridge server'
: 'Uses browser built-in speech recognition'}
</p>
</div>
</div>
</div>
</>
)
}

View File

@@ -0,0 +1,11 @@
export default function StatusIndicator({ isOnline }) {
if (isOnline === null) {
return <span className="inline-block w-2.5 h-2.5 rounded-full bg-gray-500 animate-pulse" title="Checking..." />
}
return (
<span
className={`inline-block w-2.5 h-2.5 rounded-full ${isOnline ? 'bg-emerald-400' : 'bg-red-400'}`}
title={isOnline ? 'Bridge online' : 'Bridge offline'}
/>
)
}

View File

@@ -0,0 +1,21 @@
export default function ThinkingIndicator({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
return (
<div className="flex items-start gap-3 px-4 py-3">
{image ? (
<img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 ring-1 ring-gray-700" />
) : (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)}
<div className="flex items-center gap-1 pt-2.5">
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.2s_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.4s_infinite]" />
</div>
</div>
)
}

View File

@@ -0,0 +1,32 @@
export default function VoiceButton({ isRecording, isTranscribing, onToggle, disabled }) {
const handleClick = () => {
if (disabled || isTranscribing) return
onToggle()
}
return (
<button
onClick={handleClick}
disabled={disabled || isTranscribing}
className={`w-10 h-10 rounded-full flex items-center justify-center transition-all shrink-0 ${
isRecording
? 'bg-red-500 text-white shadow-[0_0_0_4px_rgba(239,68,68,0.3)] animate-pulse'
: isTranscribing
? 'bg-gray-700 text-gray-400 cursor-wait'
: 'bg-gray-800 text-gray-400 hover:bg-gray-700 hover:text-gray-200'
}`}
title={isRecording ? 'Stop recording' : isTranscribing ? 'Transcribing...' : 'Start recording (Space)'}
>
{isTranscribing ? (
<svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
) : (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18.75a6 6 0 006-6v-1.5m-6 7.5a6 6 0 01-6-6v-1.5m6 7.5v3.75m-3.75 0h7.5M12 15.75a3 3 0 01-3-3V4.5a3 3 0 116 0v8.25a3 3 0 01-3 3z" />
</svg>
)}
</button>
)
}

View File

@@ -0,0 +1,28 @@
import { useState, useEffect } from 'react'
const ACTIVE_KEY = 'homeai_active_character'
export function useActiveCharacter() {
const [character, setCharacter] = useState(null)
useEffect(() => {
const activeId = localStorage.getItem(ACTIVE_KEY)
if (!activeId) return
fetch(`/api/characters/${activeId}`)
.then(r => r.ok ? r.json() : null)
.then(profile => {
if (profile) {
setCharacter({
id: profile.id,
name: profile.data.display_name || profile.data.name || 'AI',
image: profile.image || null,
tts: profile.data.tts || null,
})
}
})
.catch(() => {})
}, [])
return character
}

View File

@@ -0,0 +1,18 @@
import { useState, useEffect, useRef } from 'react'
import { healthCheck } from '../lib/api'
export function useBridgeHealth() {
const [isOnline, setIsOnline] = useState(null)
const intervalRef = useRef(null)
useEffect(() => {
const check = async () => {
setIsOnline(await healthCheck())
}
check()
intervalRef.current = setInterval(check, 15000)
return () => clearInterval(intervalRef.current)
}, [])
return isOnline
}

View File

@@ -0,0 +1,124 @@
import { useState, useCallback, useEffect, useRef } from 'react'
import { sendMessage } from '../lib/api'
import { getConversation, saveConversation } from '../lib/conversationApi'
export function useChat(conversationId, conversationMeta, onConversationUpdate) {
const [messages, setMessages] = useState([])
const [isLoading, setIsLoading] = useState(false)
const [isLoadingConv, setIsLoadingConv] = useState(false)
const convRef = useRef(null)
const idRef = useRef(conversationId)
// Keep idRef in sync
useEffect(() => { idRef.current = conversationId }, [conversationId])
// Load conversation from server when ID changes
useEffect(() => {
if (!conversationId) {
setMessages([])
convRef.current = null
return
}
let cancelled = false
setIsLoadingConv(true)
getConversation(conversationId).then(conv => {
if (cancelled) return
if (conv) {
convRef.current = conv
setMessages(conv.messages || [])
} else {
convRef.current = null
setMessages([])
}
setIsLoadingConv(false)
}).catch(() => {
if (!cancelled) {
convRef.current = null
setMessages([])
setIsLoadingConv(false)
}
})
return () => { cancelled = true }
}, [conversationId])
// Persist conversation to server
const persist = useCallback(async (updatedMessages, title, overrideId) => {
const id = overrideId || idRef.current
if (!id) return
const now = new Date().toISOString()
const conv = {
id,
title: title || convRef.current?.title || '',
characterId: conversationMeta?.characterId || convRef.current?.characterId || '',
characterName: conversationMeta?.characterName || convRef.current?.characterName || '',
createdAt: convRef.current?.createdAt || now,
updatedAt: now,
messages: updatedMessages,
}
convRef.current = conv
await saveConversation(conv).catch(() => {})
if (onConversationUpdate) {
onConversationUpdate(id, {
title: conv.title,
updatedAt: conv.updatedAt,
messageCount: conv.messages.length,
})
}
}, [conversationMeta, onConversationUpdate])
// send accepts an optional overrideId for when the conversation was just created
const send = useCallback(async (text, overrideId) => {
if (!text.trim() || isLoading) return null
const userMsg = { id: Date.now(), role: 'user', content: text.trim(), timestamp: new Date().toISOString() }
const isFirstMessage = messages.length === 0
const newMessages = [...messages, userMsg]
setMessages(newMessages)
setIsLoading(true)
try {
const response = await sendMessage(text.trim(), conversationMeta?.characterId || null)
const assistantMsg = {
id: Date.now() + 1,
role: 'assistant',
content: response,
timestamp: new Date().toISOString(),
}
const allMessages = [...newMessages, assistantMsg]
setMessages(allMessages)
const title = isFirstMessage
? text.trim().slice(0, 80) + (text.trim().length > 80 ? '...' : '')
: undefined
await persist(allMessages, title, overrideId)
return response
} catch (err) {
const errorMsg = {
id: Date.now() + 1,
role: 'assistant',
content: `Error: ${err.message}`,
timestamp: new Date().toISOString(),
isError: true,
}
const allMessages = [...newMessages, errorMsg]
setMessages(allMessages)
await persist(allMessages, undefined, overrideId)
return null
} finally {
setIsLoading(false)
}
}, [isLoading, messages, conversationMeta, persist])
const clearHistory = useCallback(async () => {
setMessages([])
if (idRef.current) {
await persist([], undefined)
}
}, [persist])
return { messages, isLoading, isLoadingConv, send, clearHistory }
}

View File

@@ -0,0 +1,66 @@
import { useState, useEffect, useCallback } from 'react'
import { listConversations, saveConversation, deleteConversation as deleteConv } from '../lib/conversationApi'
const ACTIVE_KEY = 'homeai_active_conversation'
export function useConversations() {
const [conversations, setConversations] = useState([])
const [activeId, setActiveId] = useState(() => localStorage.getItem(ACTIVE_KEY) || null)
const [isLoading, setIsLoading] = useState(true)
const loadList = useCallback(async () => {
try {
const list = await listConversations()
setConversations(list)
} catch {
setConversations([])
} finally {
setIsLoading(false)
}
}, [])
useEffect(() => { loadList() }, [loadList])
const select = useCallback((id) => {
setActiveId(id)
if (id) {
localStorage.setItem(ACTIVE_KEY, id)
} else {
localStorage.removeItem(ACTIVE_KEY)
}
}, [])
const create = useCallback(async (characterId, characterName) => {
const id = `conv_${Date.now()}`
const now = new Date().toISOString()
const conv = {
id,
title: '',
characterId: characterId || '',
characterName: characterName || '',
createdAt: now,
updatedAt: now,
messages: [],
}
await saveConversation(conv)
setConversations(prev => [{ ...conv, messageCount: 0 }, ...prev])
select(id)
return id
}, [select])
const remove = useCallback(async (id) => {
await deleteConv(id)
setConversations(prev => prev.filter(c => c.id !== id))
if (activeId === id) {
select(null)
}
}, [activeId, select])
const updateMeta = useCallback((id, updates) => {
setConversations(prev => prev.map(c =>
c.id === id ? { ...c, ...updates } : c
))
}, [])
return { conversations, activeId, isLoading, select, create, remove, updateMeta, refresh: loadList }
}

View File

@@ -0,0 +1,27 @@
import { useState, useCallback } from 'react'
import { DEFAULT_SETTINGS } from '../lib/constants'
const STORAGE_KEY = 'homeai_dashboard_settings'
function loadSettings() {
try {
const stored = localStorage.getItem(STORAGE_KEY)
return stored ? { ...DEFAULT_SETTINGS, ...JSON.parse(stored) } : { ...DEFAULT_SETTINGS }
} catch {
return { ...DEFAULT_SETTINGS }
}
}
export function useSettings() {
const [settings, setSettings] = useState(loadSettings)
const updateSetting = useCallback((key, value) => {
setSettings((prev) => {
const next = { ...prev, [key]: value }
localStorage.setItem(STORAGE_KEY, JSON.stringify(next))
return next
})
}, [])
return { settings, updateSetting }
}

View File

@@ -0,0 +1,56 @@
import { useState, useRef, useCallback } from 'react'
import { synthesize } from '../lib/api'
export function useTtsPlayback(voice, engine = 'kokoro', model = null) {
const [isPlaying, setIsPlaying] = useState(false)
const audioCtxRef = useRef(null)
const sourceRef = useRef(null)
const getAudioContext = () => {
if (!audioCtxRef.current || audioCtxRef.current.state === 'closed') {
audioCtxRef.current = new AudioContext()
}
return audioCtxRef.current
}
const speak = useCallback(async (text) => {
if (!text) return
// Stop any current playback; detach onended first so the old source's
// handler can't reset isPlaying/sourceRef after the new source starts
if (sourceRef.current) {
sourceRef.current.onended = null
try { sourceRef.current.stop() } catch {}
}
setIsPlaying(true)
try {
const audioData = await synthesize(text, voice, engine, model)
const ctx = getAudioContext()
if (ctx.state === 'suspended') await ctx.resume()
const audioBuffer = await ctx.decodeAudioData(audioData)
const source = ctx.createBufferSource()
source.buffer = audioBuffer
source.connect(ctx.destination)
sourceRef.current = source
source.onended = () => {
setIsPlaying(false)
sourceRef.current = null
}
source.start()
} catch (err) {
console.error('TTS playback error:', err)
setIsPlaying(false)
}
}, [voice, engine, model])
const stop = useCallback(() => {
if (sourceRef.current) {
try { sourceRef.current.stop() } catch {}
sourceRef.current = null
}
setIsPlaying(false)
}, [])
return { isPlaying, speak, stop }
}

View File

@@ -0,0 +1,91 @@
import { useState, useRef, useCallback } from 'react'
import { createRecorder } from '../lib/audio'
import { transcribe } from '../lib/api'
export function useVoiceInput(sttMode = 'bridge') {
const [isRecording, setIsRecording] = useState(false)
const [isTranscribing, setIsTranscribing] = useState(false)
const recorderRef = useRef(null)
const webSpeechRef = useRef(null)
const startRecording = useCallback(async () => {
if (isRecording) return
if (sttMode === 'webspeech' && ('webkitSpeechRecognition' in window || 'SpeechRecognition' in window)) {
return startWebSpeech()
}
try {
const recorder = createRecorder()
recorderRef.current = recorder
await recorder.start()
setIsRecording(true)
} catch (err) {
console.error('Mic access error:', err)
}
}, [isRecording, sttMode])
const stopRecording = useCallback(async () => {
if (!isRecording) return null
if (sttMode === 'webspeech' && webSpeechRef.current) {
return stopWebSpeech()
}
setIsRecording(false)
setIsTranscribing(true)
try {
const wavBlob = await recorderRef.current.stop()
recorderRef.current = null
const text = await transcribe(wavBlob)
return text
} catch (err) {
console.error('Transcription error:', err)
return null
} finally {
setIsTranscribing(false)
}
}, [isRecording, sttMode])
function startWebSpeech() {
return new Promise((resolve) => {
const SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition
const recognition = new SpeechRecognition()
recognition.continuous = false
recognition.interimResults = false
recognition.lang = 'en-US'
// Attach handlers up front: with continuous = false the engine can stop on
// its own after silence, so attaching them only in stopWebSpeech risks
// missing the result and leaving the stop promise pending forever.
const state = { recognition, transcript: null, ended: false, onDone: null }
recognition.onresult = (e) => {
state.transcript = e.results[0]?.[0]?.transcript || ''
}
recognition.onerror = () => {
state.transcript = null
}
recognition.onend = () => {
state.ended = true
setIsRecording(false)
if (state.onDone) state.onDone(state.transcript)
}
webSpeechRef.current = state
recognition.start()
setIsRecording(true)
resolve()
})
}
function stopWebSpeech() {
return new Promise((resolve) => {
const state = webSpeechRef.current
webSpeechRef.current = null
if (!state) return resolve(null)
if (state.ended) return resolve(state.transcript)
state.onDone = resolve
state.recognition.stop()
})
}
return { isRecording, isTranscribing, startRecording, stopRecording }
}

View File

@@ -0,0 +1,35 @@
@import "tailwindcss";
body {
margin: 0;
background-color: #030712;
color: #f3f4f6;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
#root {
min-height: 100vh;
}
/* Scrollbar styling for dark theme */
::-webkit-scrollbar {
width: 8px;
}
::-webkit-scrollbar-track {
background: #0a0a0f;
}
::-webkit-scrollbar-thumb {
background: #374151;
border-radius: 4px;
}
::-webkit-scrollbar-thumb:hover {
background: #4b5563;
}
::selection {
background: rgba(99, 102, 241, 0.3);
}

View File

@@ -0,0 +1,49 @@
import Ajv from 'ajv'
import schema from '../../schema/character.schema.json'
const ajv = new Ajv({ allErrors: true, strict: false })
const validate = ajv.compile(schema)
/**
* Migrate a v1 character config to v2 in-place.
* Removes live2d/vtube fields, converts gaze_preset to gaze_presets array,
* and initialises new persona fields.
*/
export function migrateV1toV2(config) {
config.schema_version = 2
// Remove deprecated fields
delete config.live2d_expressions
delete config.vtube_ws_triggers
// Convert single gaze_preset string → gaze_presets array
if ('gaze_preset' in config) {
const old = config.gaze_preset
config.gaze_presets = old ? [{ preset: old, trigger: 'self-portrait' }] : []
delete config.gaze_preset
}
if (!config.gaze_presets) {
config.gaze_presets = []
}
// Initialise new fields if absent
if (config.background === undefined) config.background = ''
if (config.dialogue_style === undefined) config.dialogue_style = ''
if (config.appearance === undefined) config.appearance = ''
if (config.skills === undefined) config.skills = []
return config
}
export function validateCharacter(config) {
// Auto-migrate v1 → v2
if (config.schema_version === 1 || config.schema_version === undefined) {
migrateV1toV2(config)
}
const valid = validate(config)
if (!valid) {
throw new Error(ajv.errorsText(validate.errors))
}
return true
}
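
A quick sketch of what the migration does to a v1 config (field values are hypothetical):

import { migrateV1toV2 } from './SchemaValidator'

const legacy = {
schema_version: 1,
name: 'aria',
system_prompt: 'You are Aria.',
tts: { engine: 'kokoro' },
gaze_preset: 'aria_portrait', // v1 single-string form
live2d_expressions: { smile: 1 }, // deprecated, dropped by the migration
}
migrateV1toV2(legacy)
// legacy.schema_version === 2
// legacy.gaze_presets → [{ preset: 'aria_portrait', trigger: 'self-portrait' }]
// legacy.background / dialogue_style / appearance → '' ; legacy.skills → []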

View File

@@ -0,0 +1,68 @@
const MAX_RETRIES = 3
const RETRY_DELAY_MS = 2000
async function fetchWithRetry(url, options, retries = MAX_RETRIES) {
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const res = await fetch(url, options)
if (res.status === 502 && attempt < retries) {
// Bridge unreachable — wait and retry
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
continue
}
return res
} catch (err) {
if (attempt >= retries) throw err
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
}
}
}
export async function sendMessage(text, characterId = null) {
const payload = { message: text, agent: 'main' }
if (characterId) payload.character_id = characterId
const res = await fetchWithRetry('/api/agent/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
})
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Request failed' }))
throw new Error(err.error || `HTTP ${res.status}`)
}
const data = await res.json()
return data.response
}
export async function synthesize(text, voice, engine = 'kokoro', model = null) {
const payload = { text, voice, engine }
if (model) payload.model = model
const res = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
})
if (!res.ok) throw new Error('TTS failed')
return await res.arrayBuffer()
}
export async function transcribe(wavBlob) {
const res = await fetch('/api/stt', {
method: 'POST',
headers: { 'Content-Type': 'audio/wav' },
body: wavBlob,
})
if (!res.ok) throw new Error('STT failed')
const data = await res.json()
return data.text
}
export async function healthCheck() {
try {
const res = await fetch('/api/health?url=' + encodeURIComponent('http://localhost:8081/'), { signal: AbortSignal.timeout(5000) })
const data = await res.json()
return data.status === 'online'
} catch {
return false
}
}

View File

@@ -0,0 +1,92 @@
const TARGET_RATE = 16000
export function createRecorder() {
let audioCtx
let source
let processor
let stream
let samples = []
async function start() {
samples = []
stream = await navigator.mediaDevices.getUserMedia({
audio: { channelCount: 1, sampleRate: TARGET_RATE },
})
audioCtx = new AudioContext({ sampleRate: TARGET_RATE })
source = audioCtx.createMediaStreamSource(stream)
processor = audioCtx.createScriptProcessor(4096, 1, 1)
processor.onaudioprocess = (e) => {
const input = e.inputBuffer.getChannelData(0)
samples.push(new Float32Array(input))
}
source.connect(processor)
processor.connect(audioCtx.destination)
}
async function stop() {
processor.disconnect()
source.disconnect()
stream.getTracks().forEach((t) => t.stop())
await audioCtx.close()
const totalLength = samples.reduce((acc, s) => acc + s.length, 0)
const merged = new Float32Array(totalLength)
let offset = 0
for (const chunk of samples) {
merged.set(chunk, offset)
offset += chunk.length
}
const resampled = audioCtx.sampleRate !== TARGET_RATE
? resample(merged, audioCtx.sampleRate, TARGET_RATE)
: merged
return encodeWav(resampled, TARGET_RATE)
}
return { start, stop }
}
function resample(samples, fromRate, toRate) {
const ratio = fromRate / toRate
const newLength = Math.round(samples.length / ratio)
const result = new Float32Array(newLength)
for (let i = 0; i < newLength; i++) {
result[i] = samples[Math.round(i * ratio)]
}
return result
}
function encodeWav(samples, sampleRate) {
// 16-bit mono PCM WAV: 44-byte header followed by 2 bytes per sample
const numSamples = samples.length
const buffer = new ArrayBuffer(44 + numSamples * 2)
const view = new DataView(buffer)
writeString(view, 0, 'RIFF')
view.setUint32(4, 36 + numSamples * 2, true) // RIFF chunk size
writeString(view, 8, 'WAVE')
writeString(view, 12, 'fmt ')
view.setUint32(16, 16, true) // fmt chunk size
view.setUint16(20, 1, true) // audio format: 1 = PCM
view.setUint16(22, 1, true) // channels: mono
view.setUint32(24, sampleRate, true) // sample rate
view.setUint32(28, sampleRate * 2, true) // byte rate = rate * block align
view.setUint16(32, 2, true) // block align: 1 channel * 2 bytes
view.setUint16(34, 16, true) // bits per sample
writeString(view, 36, 'data')
view.setUint32(40, numSamples * 2, true) // data chunk size
for (let i = 0; i < numSamples; i++) {
const s = Math.max(-1, Math.min(1, samples[i]))
view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true)
}
return new Blob([buffer], { type: 'audio/wav' })
}
function writeString(view, offset, str) {
for (let i = 0; i < str.length; i++) {
view.setUint8(offset + i, str.charCodeAt(i))
}
}

View File

@@ -0,0 +1,45 @@
export const DEFAULT_VOICE = 'af_heart'
export const VOICES = [
{ id: 'af_heart', label: 'Heart (F, US)' },
{ id: 'af_alloy', label: 'Alloy (F, US)' },
{ id: 'af_aoede', label: 'Aoede (F, US)' },
{ id: 'af_bella', label: 'Bella (F, US)' },
{ id: 'af_jessica', label: 'Jessica (F, US)' },
{ id: 'af_kore', label: 'Kore (F, US)' },
{ id: 'af_nicole', label: 'Nicole (F, US)' },
{ id: 'af_nova', label: 'Nova (F, US)' },
{ id: 'af_river', label: 'River (F, US)' },
{ id: 'af_sarah', label: 'Sarah (F, US)' },
{ id: 'af_sky', label: 'Sky (F, US)' },
{ id: 'am_adam', label: 'Adam (M, US)' },
{ id: 'am_echo', label: 'Echo (M, US)' },
{ id: 'am_eric', label: 'Eric (M, US)' },
{ id: 'am_fenrir', label: 'Fenrir (M, US)' },
{ id: 'am_liam', label: 'Liam (M, US)' },
{ id: 'am_michael', label: 'Michael (M, US)' },
{ id: 'am_onyx', label: 'Onyx (M, US)' },
{ id: 'am_puck', label: 'Puck (M, US)' },
{ id: 'bf_alice', label: 'Alice (F, UK)' },
{ id: 'bf_emma', label: 'Emma (F, UK)' },
{ id: 'bf_isabella', label: 'Isabella (F, UK)' },
{ id: 'bf_lily', label: 'Lily (F, UK)' },
{ id: 'bm_daniel', label: 'Daniel (M, UK)' },
{ id: 'bm_fable', label: 'Fable (M, UK)' },
{ id: 'bm_george', label: 'George (M, UK)' },
{ id: 'bm_lewis', label: 'Lewis (M, UK)' },
]
export const TTS_ENGINES = [
{ id: 'kokoro', label: 'Kokoro (local)' },
{ id: 'chatterbox', label: 'Chatterbox (voice clone)' },
{ id: 'qwen3', label: 'Qwen3 TTS' },
{ id: 'elevenlabs', label: 'ElevenLabs (cloud)' },
]
export const DEFAULT_SETTINGS = {
ttsEngine: 'kokoro',
voice: DEFAULT_VOICE,
autoTts: true,
sttMode: 'bridge',
}

View File

@@ -0,0 +1,25 @@
export async function listConversations() {
const res = await fetch('/api/conversations')
if (!res.ok) throw new Error(`Failed to list conversations: ${res.status}`)
return res.json()
}
export async function getConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`)
if (!res.ok) return null
return res.json()
}
export async function saveConversation(conversation) {
const res = await fetch('/api/conversations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(conversation),
})
if (!res.ok) throw new Error(`Failed to save conversation: ${res.status}`)
}
export async function deleteConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`, { method: 'DELETE' })
if (!res.ok) throw new Error(`Failed to delete conversation: ${res.status}`)
}

View File

@@ -0,0 +1,45 @@
export async function getPersonalMemories(characterId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`)
if (!res.ok) return { characterId, memories: [] }
return res.json()
}
export async function savePersonalMemory(characterId, memory) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deletePersonalMemory(characterId, memoryId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}
export async function getGeneralMemories() {
const res = await fetch('/api/memories/general')
if (!res.ok) return { memories: [] }
return res.json()
}
export async function saveGeneralMemory(memory) {
const res = await fetch('/api/memories/general', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deleteGeneralMemory(memoryId) {
const res = await fetch(`/api/memories/general/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}
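
A usage sketch for these helpers, from any async context — note the shape of the memory payload (text/category) is an assumption here, since this file only forwards whatever the bridge accepts:

import { savePersonalMemory, getPersonalMemories } from './memoryApi'

// Hypothetical payload and character id; the bridge defines the real schema.
await savePersonalMemory('aria_1700000000000', {
text: 'Prefers metric units',
category: 'preference',
})
const { memories } = await getPersonalMemories('aria_1700000000000')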

View File

@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.jsx'
createRoot(document.getElementById('root')).render(
<StrictMode>
<App />
</StrictMode>,
)

View File

@@ -0,0 +1,471 @@
import { useState, useEffect, useCallback } from 'react';
import { useNavigate } from 'react-router-dom';
import { validateCharacter } from '../lib/SchemaValidator';
const ACTIVE_KEY = 'homeai_active_character';
function getActiveId() {
return localStorage.getItem(ACTIVE_KEY) || null;
}
function setActiveId(id) {
localStorage.setItem(ACTIVE_KEY, id);
}
export default function Characters() {
const [profiles, setProfiles] = useState([]);
const [activeId, setActive] = useState(getActiveId);
const [error, setError] = useState(null);
const [dragOver, setDragOver] = useState(false);
const [loading, setLoading] = useState(true);
const [satMap, setSatMap] = useState({ default: '', satellites: {} });
const [newSatId, setNewSatId] = useState('');
const [newSatChar, setNewSatChar] = useState('');
const navigate = useNavigate();
// Load profiles and satellite map on mount
useEffect(() => {
Promise.all([
fetch('/api/characters').then(r => r.json()),
fetch('/api/satellite-map').then(r => r.json()),
])
.then(([chars, map]) => {
setProfiles(chars);
setSatMap(map);
setLoading(false);
})
.catch(err => { setError(`Failed to load: ${err.message}`); setLoading(false); });
}, []);
const saveSatMap = useCallback(async (updated) => {
setSatMap(updated);
await fetch('/api/satellite-map', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(updated),
});
}, []);
const saveProfile = useCallback(async (profile) => {
const res = await fetch('/api/characters', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(profile),
});
if (!res.ok) throw new Error('Failed to save profile');
}, []);
const deleteProfile = useCallback(async (id) => {
const safeId = id.replace(/[^a-zA-Z0-9_\-\.]/g, '_');
await fetch(`/api/characters/${safeId}`, { method: 'DELETE' });
}, []);
const handleImport = (e) => {
const files = Array.from(e.target?.files || []);
importFiles(files);
if (e.target) e.target.value = '';
};
const importFiles = (files) => {
files.forEach(file => {
if (!file.name.endsWith('.json')) return;
const reader = new FileReader();
reader.onload = async (ev) => {
try {
const data = JSON.parse(ev.target.result);
validateCharacter(data);
// Sanitise the same way deleteProfile does, so the stored id matches on delete
const id = String(data.name || 'character').replace(/[^a-zA-Z0-9_\-\.]/g, '_') + '_' + Date.now();
const profile = { id, data, image: null, addedAt: new Date().toISOString() };
await saveProfile(profile);
setProfiles(prev => [...prev, profile]);
setError(null);
} catch (err) {
setError(`Import failed for ${file.name}: ${err.message}`);
}
};
reader.readAsText(file);
});
};
const handleDrop = (e) => {
e.preventDefault();
setDragOver(false);
const files = Array.from(e.dataTransfer.files);
importFiles(files);
};
const handleImageUpload = (profileId, e) => {
const file = e.target.files[0];
if (!file) return;
const reader = new FileReader();
reader.onload = async (ev) => {
const updated = profiles.map(p => p.id === profileId ? { ...p, image: ev.target.result } : p);
const profile = updated.find(p => p.id === profileId);
if (profile) await saveProfile(profile);
setProfiles(updated);
};
reader.readAsDataURL(file);
};
const removeProfile = async (id) => {
await deleteProfile(id);
setProfiles(prev => prev.filter(p => p.id !== id));
if (activeId === id) {
setActive(null);
localStorage.removeItem(ACTIVE_KEY);
}
};
const activateProfile = (id) => {
setActive(id);
setActiveId(id);
// Sync active character's TTS settings to chat settings
const profile = profiles.find(p => p.id === id);
if (profile?.data?.tts) {
const tts = profile.data.tts;
const engine = tts.engine || 'kokoro';
let voice;
if (engine === 'kokoro') voice = tts.kokoro_voice || 'af_heart';
else if (engine === 'elevenlabs') voice = tts.elevenlabs_voice_id || '';
else if (engine === 'chatterbox') voice = tts.voice_ref_path || '';
else voice = '';
try {
const raw = localStorage.getItem('homeai_dashboard_settings');
const settings = raw ? JSON.parse(raw) : {};
localStorage.setItem('homeai_dashboard_settings', JSON.stringify({
...settings,
ttsEngine: engine,
voice: voice,
}));
} catch { /* ignore */ }
}
};
const exportProfile = (profile) => {
const dataStr = "data:text/json;charset=utf-8," + encodeURIComponent(JSON.stringify(profile.data, null, 2));
const a = document.createElement('a');
a.href = dataStr;
a.download = `${profile.data.name || 'character'}.json`;
a.click();
};
const editProfile = (profile) => {
sessionStorage.setItem('edit_character', JSON.stringify(profile.data));
sessionStorage.setItem('edit_character_profile_id', profile.id);
navigate('/editor');
};
const activeProfile = profiles.find(p => p.id === activeId);
return (
<div className="space-y-8">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-3xl font-bold text-gray-100">Characters</h1>
<p className="text-sm text-gray-500 mt-1">
{profiles.length} profile{profiles.length !== 1 ? 's' : ''} stored
{activeProfile && (
<span className="ml-2 text-emerald-400">
Active: {activeProfile.data.display_name || activeProfile.data.name}
</span>
)}
</p>
</div>
<div className="flex gap-3">
<button
onClick={() => {
sessionStorage.removeItem('edit_character');
sessionStorage.removeItem('edit_character_profile_id');
navigate('/editor');
}}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New Character
</button>
<label className="flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg cursor-pointer border border-gray-700 transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5m-13.5-9L12 3m0 0l4.5 4.5M12 3v13.5" />
</svg>
Import JSON
<input type="file" accept=".json" multiple className="hidden" onChange={handleImport} />
</label>
</div>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
</div>
)}
{/* Drop zone */}
<div
onDragOver={(e) => { e.preventDefault(); setDragOver(true); }}
onDragLeave={() => setDragOver(false)}
onDrop={handleDrop}
className={`border-2 border-dashed rounded-xl p-8 text-center transition-colors ${
dragOver
? 'border-indigo-500 bg-indigo-500/10'
: 'border-gray-700 hover:border-gray-600'
}`}
>
<svg className="w-10 h-10 mx-auto text-gray-600 mb-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5m-13.5-9L12 3m0 0l4.5 4.5M12 3v13.5" />
</svg>
<p className="text-gray-500 text-sm">Drop character JSON files here to import</p>
</div>
{/* Profile grid */}
{loading ? (
<div className="text-center py-16">
<p className="text-gray-500">Loading characters...</p>
</div>
) : profiles.length === 0 ? (
<div className="text-center py-16">
<svg className="w-16 h-16 mx-auto text-gray-700 mb-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.75 6a3.75 3.75 0 11-7.5 0 3.75 3.75 0 017.5 0zM4.501 20.118a7.5 7.5 0 0114.998 0A17.933 17.933 0 0112 21.75c-2.676 0-5.216-.584-7.499-1.632z" />
</svg>
<p className="text-gray-500">No character profiles yet. Import a JSON file to get started.</p>
</div>
) : (
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
{profiles.map(profile => {
const isActive = profile.id === activeId;
const char = profile.data;
return (
<div
key={profile.id}
className={`relative rounded-xl border overflow-hidden transition-all duration-200 ${
isActive
? 'border-emerald-500/60 bg-emerald-500/5 ring-1 ring-emerald-500/30'
: 'border-gray-700 bg-gray-800/50 hover:border-gray-600'
}`}
>
{/* Image area */}
<div className="relative h-48 bg-gray-900 flex items-center justify-center overflow-hidden group">
{profile.image ? (
<img
src={profile.image}
alt={char.display_name || char.name}
className="w-full h-full object-cover"
/>
) : (
<div className="text-6xl font-bold text-gray-700 select-none">
{(char.display_name || char.name || '?')[0].toUpperCase()}
</div>
)}
<label className="absolute inset-0 flex items-center justify-center bg-black/50 opacity-0 group-hover:opacity-100 transition-opacity cursor-pointer">
<div className="text-center">
<svg className="w-8 h-8 mx-auto text-white/80 mb-1" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6.827 6.175A2.31 2.31 0 015.186 7.23c-.38.054-.757.112-1.134.175C2.999 7.58 2.25 8.507 2.25 9.574V18a2.25 2.25 0 002.25 2.25h15A2.25 2.25 0 0021.75 18V9.574c0-1.067-.75-1.994-1.802-2.169a47.865 47.865 0 00-1.134-.175 2.31 2.31 0 01-1.64-1.055l-.822-1.316a2.192 2.192 0 00-1.736-1.039 48.774 48.774 0 00-5.232 0 2.192 2.192 0 00-1.736 1.039l-.821 1.316z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M16.5 12.75a4.5 4.5 0 11-9 0 4.5 4.5 0 019 0z" />
</svg>
<span className="text-xs text-white/70">Change image</span>
</div>
<input
type="file"
accept="image/*"
className="hidden"
onChange={(e) => handleImageUpload(profile.id, e)}
/>
</label>
{isActive && (
<span className="absolute top-2 right-2 px-2 py-0.5 bg-emerald-500 text-white text-xs font-medium rounded-full">
Active
</span>
)}
</div>
{/* Info */}
<div className="p-4 space-y-3">
<div>
<h3 className="text-lg font-semibold text-gray-200">
{char.display_name || char.name}
</h3>
<p className="text-xs text-gray-500 mt-0.5">{char.description}</p>
</div>
<div className="flex flex-wrap gap-1.5">
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.tts?.engine || 'kokoro'}
</span>
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.model_overrides?.primary || 'default'}
</span>
{char.tts?.engine === 'kokoro' && char.tts?.kokoro_voice && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.tts.kokoro_voice}
</span>
)}
{char.tts?.engine === 'elevenlabs' && char.tts?.elevenlabs_voice_id && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.elevenlabs_voice_id}>
{char.tts.elevenlabs_voice_name || char.tts.elevenlabs_voice_id.slice(0, 8) + '…'}
</span>
)}
{char.tts?.engine === 'chatterbox' && char.tts?.voice_ref_path && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.voice_ref_path}>
{char.tts.voice_ref_path.split('/').pop()}
</span>
)}
{(() => {
const defaultPreset = char.gaze_presets?.find(gp => gp.trigger === 'self-portrait')?.preset
|| char.gaze_presets?.[0]?.preset
|| char.gaze_preset
|| null;
return defaultPreset ? (
<span className="px-2 py-0.5 bg-violet-500/20 text-violet-300 text-xs rounded-full border border-violet-500/30" title={`GAZE: ${defaultPreset}`}>
{defaultPreset}
</span>
) : null;
})()}
</div>
<div className="flex gap-2 pt-1">
{!isActive ? (
<button
onClick={() => activateProfile(profile.id)}
className="flex-1 px-3 py-1.5 bg-emerald-600 hover:bg-emerald-500 text-white text-sm rounded-lg transition-colors"
>
Activate
</button>
) : (
<button
disabled
className="flex-1 px-3 py-1.5 bg-gray-700 text-gray-500 text-sm rounded-lg cursor-not-allowed"
>
Active
</button>
)}
<button
onClick={() => editProfile(profile)}
className="px-3 py-1.5 bg-gray-700 hover:bg-gray-600 text-gray-300 text-sm rounded-lg transition-colors"
title="Edit"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M16.862 4.487l1.687-1.688a1.875 1.875 0 112.652 2.652L10.582 16.07a4.5 4.5 0 01-1.897 1.13L6 18l.8-2.685a4.5 4.5 0 011.13-1.897l8.932-8.931zm0 0L19.5 7.125M18 14v4.75A2.25 2.25 0 0115.75 21H5.25A2.25 2.25 0 013 18.75V8.25A2.25 2.25 0 015.25 6H10" />
</svg>
</button>
<button
onClick={() => exportProfile(profile)}
className="px-3 py-1.5 bg-gray-700 hover:bg-gray-600 text-gray-300 text-sm rounded-lg transition-colors"
title="Export"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5M16.5 12L12 16.5m0 0L7.5 12m4.5 4.5V3" />
</svg>
</button>
<button
onClick={() => removeProfile(profile.id)}
className="px-3 py-1.5 bg-gray-700 hover:bg-red-600 text-gray-300 hover:text-white text-sm rounded-lg transition-colors"
title="Delete"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
</div>
</div>
);
})}
</div>
)}
{/* Satellite Assignment */}
{!loading && profiles.length > 0 && (
<div className="bg-gray-900 border border-gray-800 rounded-xl p-5 space-y-4">
<div>
<h2 className="text-lg font-semibold text-gray-200">Satellite Routing</h2>
<p className="text-xs text-gray-500 mt-1">Assign characters to voice satellites. Unmapped satellites use the default.</p>
</div>
{/* Default character */}
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400 w-32 shrink-0">Default</label>
<select
value={satMap.default || ''}
onChange={(e) => saveSatMap({ ...satMap, default: e.target.value })}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- None --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
</div>
{/* Per-satellite assignments */}
{Object.entries(satMap.satellites || {}).map(([satId, charId]) => (
<div key={satId} className="flex items-center gap-3">
<span className="text-sm text-gray-300 w-32 shrink-0 truncate font-mono" title={satId}>{satId}</span>
<select
value={charId}
onChange={(e) => {
const updated = { ...satMap, satellites: { ...satMap.satellites, [satId]: e.target.value } };
saveSatMap(updated);
}}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
const { [satId]: _, ...rest } = satMap.satellites;
saveSatMap({ ...satMap, satellites: rest });
}}
className="px-2 py-1.5 bg-gray-700 hover:bg-red-600 text-gray-400 hover:text-white rounded-lg transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
))}
{/* Add new satellite */}
<div className="flex items-center gap-3 pt-2 border-t border-gray-800">
<input
type="text"
value={newSatId}
onChange={(e) => setNewSatId(e.target.value)}
placeholder="Satellite ID (from bridge log)"
className="w-32 shrink-0 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500 font-mono"
/>
<select
value={newSatChar}
onChange={(e) => setNewSatChar(e.target.value)}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- Select Character --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
if (newSatId && newSatChar) {
saveSatMap({ ...satMap, satellites: { ...satMap.satellites, [newSatId]: newSatChar } });
setNewSatId('');
setNewSatChar('');
}
}}
disabled={!newSatId || !newSatChar}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
Add
</button>
</div>
</div>
)}
</div>
);
}

View File

@@ -0,0 +1,146 @@
import { useState, useCallback } from 'react'
import ChatPanel from '../components/ChatPanel'
import InputBar from '../components/InputBar'
import StatusIndicator from '../components/StatusIndicator'
import SettingsDrawer from '../components/SettingsDrawer'
import ConversationList from '../components/ConversationList'
import { useSettings } from '../hooks/useSettings'
import { useBridgeHealth } from '../hooks/useBridgeHealth'
import { useChat } from '../hooks/useChat'
import { useTtsPlayback } from '../hooks/useTtsPlayback'
import { useVoiceInput } from '../hooks/useVoiceInput'
import { useActiveCharacter } from '../hooks/useActiveCharacter'
import { useConversations } from '../hooks/useConversations'
export default function Chat() {
const { settings, updateSetting } = useSettings()
const isOnline = useBridgeHealth()
const character = useActiveCharacter()
const {
conversations, activeId, isLoading: isLoadingList,
select, create, remove, updateMeta,
} = useConversations()
const convMeta = {
characterId: character?.id || '',
characterName: character?.name || '',
}
const { messages, isLoading, isLoadingConv, send, clearHistory } = useChat(activeId, convMeta, updateMeta)
// Use character's TTS config if available, fall back to global settings
const ttsEngine = character?.tts?.engine || settings.ttsEngine
const ttsVoice = ttsEngine === 'elevenlabs'
? (character?.tts?.elevenlabs_voice_id || settings.voice)
: (character?.tts?.kokoro_voice || settings.voice)
const ttsModel = ttsEngine === 'elevenlabs' ? (character?.tts?.elevenlabs_model || null) : null
const { isPlaying, speak, stop } = useTtsPlayback(ttsVoice, ttsEngine, ttsModel)
const { isRecording, isTranscribing, startRecording, stopRecording } = useVoiceInput(settings.sttMode)
const [settingsOpen, setSettingsOpen] = useState(false)
const handleSend = useCallback(async (text) => {
// Auto-create a conversation if none is active
let newId = null
if (!activeId) {
newId = await create(convMeta.characterId, convMeta.characterName)
}
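// send() resolves with the assistant's reply text, used below for optional auto-TTS.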
const response = await send(text, newId)
if (response && settings.autoTts) {
speak(response)
}
}, [activeId, create, convMeta, send, settings.autoTts, speak])
const handleVoiceToggle = useCallback(async () => {
if (isRecording) {
const text = await stopRecording()
if (text) handleSend(text)
} else {
startRecording()
}
}, [isRecording, stopRecording, startRecording, handleSend])
const handleNewChat = useCallback(() => {
create(convMeta.characterId, convMeta.characterName)
}, [create, convMeta])
return (
<div className="flex-1 flex min-h-0">
{/* Conversation sidebar */}
<ConversationList
conversations={conversations}
activeId={activeId}
onCreate={handleNewChat}
onSelect={select}
onDelete={remove}
/>
{/* Chat area */}
<div className="flex-1 flex flex-col min-h-0 min-w-0">
{/* Status bar */}
<header className="flex items-center justify-between px-4 py-2 border-b border-gray-800/50 shrink-0">
<div className="flex items-center gap-2">
<StatusIndicator isOnline={isOnline} />
<span className="text-xs text-gray-500">
{isOnline === null ? 'Connecting...' : isOnline ? 'Connected' : 'Offline'}
</span>
</div>
<div className="flex items-center gap-2">
{messages.length > 0 && (
<button
onClick={clearHistory}
className="text-xs text-gray-500 hover:text-gray-300 transition-colors px-2 py-1"
title="Clear conversation"
>
Clear
</button>
)}
{isPlaying && (
<button
onClick={stop}
className="text-xs text-indigo-400 hover:text-indigo-300 transition-colors px-2 py-1"
title="Stop speaking"
>
Stop audio
</button>
)}
<button
onClick={() => setSettingsOpen(true)}
className="text-gray-500 hover:text-gray-300 transition-colors p-1"
title="Settings"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.325.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.241-.438.613-.43.992a7.723 7.723 0 010 .255c-.008.378.137.75.43.991l1.004.827c.424.35.534.955.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.47 6.47 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.281c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.019-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.991a6.932 6.932 0 010-.255c.007-.38-.138-.751-.43-.992l-1.004-.827a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.086.22-.128.332-.183.582-.495.644-.869l.214-1.28z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
</button>
</div>
</header>
{/* Messages */}
<ChatPanel
messages={messages}
isLoading={isLoading || isLoadingConv}
onReplay={speak}
character={character}
/>
{/* Input */}
<InputBar
onSend={handleSend}
onVoiceToggle={handleVoiceToggle}
isLoading={isLoading}
isRecording={isRecording}
isTranscribing={isTranscribing}
/>
{/* Settings drawer */}
<SettingsDrawer
isOpen={settingsOpen}
onClose={() => setSettingsOpen(false)}
settings={settings}
onUpdate={updateSetting}
/>
</div>
</div>
)
}

View File

@@ -0,0 +1,376 @@
import { useState, useEffect, useCallback } from 'react';
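// Static service registry: health URL, optional web UI link, and restart target (launchd job or docker container).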
const SERVICES = [
{
name: 'Ollama',
url: 'http://localhost:11434',
healthPath: '/api/tags',
uiUrl: null,
description: 'Local LLM runtime',
category: 'AI & LLM',
restart: { type: 'launchd', id: 'gui/501/com.homeai.ollama' },
},
{
name: 'Open WebUI',
url: 'http://localhost:3030',
healthPath: '/',
uiUrl: 'http://localhost:3030',
description: 'Chat interface',
category: 'AI & LLM',
restart: { type: 'docker', id: 'homeai-open-webui' },
},
{
name: 'OpenClaw Gateway',
url: 'http://localhost:8080',
healthPath: '/',
uiUrl: null,
description: 'Agent gateway',
category: 'Agent',
restart: { type: 'launchd', id: 'gui/501/com.homeai.openclaw' },
},
{
name: 'OpenClaw Bridge',
url: 'http://localhost:8081',
healthPath: '/',
uiUrl: null,
description: 'HTTP-to-CLI bridge',
category: 'Agent',
restart: { type: 'launchd', id: 'gui/501/com.homeai.openclaw-bridge' },
},
{
name: 'Wyoming STT',
url: 'http://localhost:10300',
healthPath: '/',
uiUrl: null,
description: 'Whisper speech-to-text',
category: 'Voice',
tcp: true,
restart: { type: 'launchd', id: 'gui/501/com.homeai.wyoming-stt' },
},
{
name: 'Wyoming TTS',
url: 'http://localhost:10301',
healthPath: '/',
uiUrl: null,
description: 'Kokoro text-to-speech',
category: 'Voice',
tcp: true,
restart: { type: 'launchd', id: 'gui/501/com.homeai.wyoming-tts' },
},
{
name: 'Wyoming Satellite',
url: 'http://localhost:10700',
healthPath: '/',
uiUrl: null,
description: 'Mac Mini mic/speaker satellite',
category: 'Voice',
tcp: true,
restart: { type: 'launchd', id: 'gui/501/com.homeai.wyoming-satellite' },
},
{
name: 'Home Assistant',
url: 'https://10.0.0.199:8123',
healthPath: '/api/',
uiUrl: 'https://10.0.0.199:8123',
description: 'Smart home platform',
category: 'Smart Home',
},
{
name: 'Uptime Kuma',
url: 'http://localhost:3001',
healthPath: '/',
uiUrl: 'http://localhost:3001',
description: 'Service health monitoring',
category: 'Infrastructure',
restart: { type: 'docker', id: 'homeai-uptime-kuma' },
},
{
name: 'n8n',
url: 'http://localhost:5678',
healthPath: '/',
uiUrl: 'http://localhost:5678',
description: 'Workflow automation',
category: 'Infrastructure',
restart: { type: 'docker', id: 'homeai-n8n' },
},
{
name: 'code-server',
url: 'http://localhost:8090',
healthPath: '/',
uiUrl: 'http://localhost:8090',
description: 'Browser-based VS Code',
category: 'Infrastructure',
restart: { type: 'docker', id: 'homeai-code-server' },
},
{
name: 'Portainer',
url: 'https://10.0.0.199:9443',
healthPath: '/',
uiUrl: 'https://10.0.0.199:9443',
description: 'Docker management',
category: 'Infrastructure',
},
{
name: 'Gitea',
url: 'http://10.0.0.199:3000',
healthPath: '/',
uiUrl: 'http://10.0.0.199:3000',
description: 'Self-hosted Git',
category: 'Infrastructure',
},
];
const CATEGORY_ICONS = {
'AI & LLM': (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.813 15.904L9 18.75l-.813-2.846a4.5 4.5 0 00-3.09-3.09L2.25 12l2.846-.813a4.5 4.5 0 003.09-3.09L9 5.25l.813 2.846a4.5 4.5 0 003.09 3.09L15.75 12l-2.846.813a4.5 4.5 0 00-3.09 3.09zM18.259 8.715L18 9.75l-.259-1.035a3.375 3.375 0 00-2.455-2.456L14.25 6l1.036-.259a3.375 3.375 0 002.455-2.456L18 2.25l.259 1.035a3.375 3.375 0 002.455 2.456L21.75 6l-1.036.259a3.375 3.375 0 00-2.455 2.456zM16.894 20.567L16.5 21.75l-.394-1.183a2.25 2.25 0 00-1.423-1.423L13.5 18.75l1.183-.394a2.25 2.25 0 001.423-1.423l.394-1.183.394 1.183a2.25 2.25 0 001.423 1.423l1.183.394-1.183.394a2.25 2.25 0 00-1.423 1.423z" />
</svg>
),
'Agent': (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M8.25 3v1.5M4.5 8.25H3m18 0h-1.5M4.5 12H3m18 0h-1.5m-15 3.75H3m18 0h-1.5M8.25 19.5V21M12 3v1.5m0 15V21m3.75-18v1.5m0 15V21m-9-1.5h10.5a2.25 2.25 0 002.25-2.25V6.75a2.25 2.25 0 00-2.25-2.25H6.75A2.25 2.25 0 004.5 6.75v10.5a2.25 2.25 0 002.25 2.25zm.75-12h9v9h-9v-9z" />
</svg>
),
'Voice': (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18.75a6 6 0 006-6v-1.5m-6 7.5a6 6 0 01-6-6v-1.5m6 7.5v3.75m-3.75 0h7.5M12 15.75a3 3 0 01-3-3V4.5a3 3 0 116 0v8.25a3 3 0 01-3 3z" />
</svg>
),
'Smart Home': (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M2.25 12l8.954-8.955c.44-.439 1.152-.439 1.591 0L21.75 12M4.5 9.75v10.125c0 .621.504 1.125 1.125 1.125H9.75v-4.875c0-.621.504-1.125 1.125-1.125h2.25c.621 0 1.125.504 1.125 1.125V21h4.125c.621 0 1.125-.504 1.125-1.125V9.75M8.25 21h8.25" />
</svg>
),
'Infrastructure': (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M5.25 14.25h13.5m-13.5 0a3 3 0 01-3-3m3 3a3 3 0 100 6h13.5a3 3 0 100-6m-16.5-3a3 3 0 013-3h13.5a3 3 0 013 3m-19.5 0a4.5 4.5 0 01.9-2.7L5.737 5.1a3.375 3.375 0 012.7-1.35h7.126c1.062 0 2.062.5 2.7 1.35l2.587 3.45a4.5 4.5 0 01.9 2.7m0 0a3 3 0 01-3 3m0 3h.008v.008h-.008v-.008zm0-6h.008v.008h-.008v-.008zm-3 6h.008v.008h-.008v-.008zm0-6h.008v.008h-.008v-.008z" />
</svg>
),
};
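// Status dot: green online, red offline, pulsing amber while a check is in flight.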
function StatusDot({ status }) {
const colors = {
online: 'bg-emerald-400 shadow-emerald-400/50',
offline: 'bg-red-400 shadow-red-400/50',
checking: 'bg-amber-400 shadow-amber-400/50 animate-pulse',
unknown: 'bg-gray-500',
};
return (
<span className={`inline-block w-2.5 h-2.5 rounded-full shadow-lg ${colors[status] || colors.unknown}`} />
);
}
export default function Dashboard() {
const [statuses, setStatuses] = useState(() =>
Object.fromEntries(SERVICES.map(s => [s.name, { status: 'checking', lastCheck: null, responseTime: null }]))
);
const [lastRefresh, setLastRefresh] = useState(null);
const [restarting, setRestarting] = useState({});
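// Probe one service through the same-origin /api/health proxy (no CORS); tcp-only Wyoming services are checked with mode=tcp instead of an HTTP GET.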
const checkService = useCallback(async (service) => {
try {
const target = encodeURIComponent(service.url + service.healthPath);
const modeParam = service.tcp ? '&mode=tcp' : '';
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 8000);
const res = await fetch(`/api/health?url=${target}${modeParam}`, { signal: controller.signal });
clearTimeout(timeout);
const data = await res.json();
return { status: data.status, lastCheck: new Date(), responseTime: data.responseTime };
} catch {
return { status: 'offline', lastCheck: new Date(), responseTime: null };
}
}, []);
const refreshAll = useCallback(async () => {
setStatuses(prev =>
Object.fromEntries(Object.entries(prev).map(([k, v]) => [k, { ...v, status: 'checking' }]))
);
const results = await Promise.allSettled(
SERVICES.map(async (service) => {
const result = await checkService(service);
return { name: service.name, ...result };
})
);
const newStatuses = {};
for (const r of results) {
if (r.status === 'fulfilled') {
newStatuses[r.value.name] = {
status: r.value.status,
lastCheck: r.value.lastCheck,
responseTime: r.value.responseTime,
};
}
}
setStatuses(prev => ({ ...prev, ...newStatuses }));
setLastRefresh(new Date());
}, [checkService]);
useEffect(() => {
refreshAll();
const interval = setInterval(refreshAll, 30000);
return () => clearInterval(interval);
}, [refreshAll]);
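// Ask the middleware to restart the service via its launchd/docker target, then re-probe it.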
const restartService = useCallback(async (service) => {
if (!service.restart) return;
setRestarting(prev => ({ ...prev, [service.name]: true }));
try {
const res = await fetch('/api/service/restart', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(service.restart),
});
const data = await res.json();
if (!data.ok) {
console.error(`Restart failed for ${service.name}:`, data.error);
}
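// Give the service a few seconds to come back up before re-checking.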
setTimeout(async () => {
const result = await checkService(service);
setStatuses(prev => ({ ...prev, [service.name]: result }));
setRestarting(prev => ({ ...prev, [service.name]: false }));
}, 3000);
} catch (err) {
console.error(`Restart failed for ${service.name}:`, err);
setRestarting(prev => ({ ...prev, [service.name]: false }));
}
}, [checkService]);
const categories = [...new Set(SERVICES.map(s => s.category))];
const onlineCount = Object.values(statuses).filter(s => s.status === 'online').length;
const offlineCount = Object.values(statuses).filter(s => s.status === 'offline').length;
const totalCount = SERVICES.length;
const allOnline = onlineCount === totalCount;
return (
<div className="space-y-8">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-3xl font-bold text-gray-100">Service Status</h1>
<p className="text-sm text-gray-500 mt-1">
{onlineCount}/{totalCount} services online
{lastRefresh && (
<span className="ml-3">
Last check: {lastRefresh.toLocaleTimeString()}
</span>
)}
</p>
</div>
<button
onClick={refreshAll}
className="flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg border border-gray-700 transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M16.023 9.348h4.992v-.001M2.985 19.644v-4.992m0 0h4.992m-4.993 0l3.181 3.183a8.25 8.25 0 0013.803-3.7M4.031 9.865a8.25 8.25 0 0113.803-3.7l3.181 3.182" />
</svg>
Refresh
</button>
</div>
{/* Summary bar */}
<div className="h-2 rounded-full bg-gray-800 overflow-hidden flex">
{allOnline ? (
<div
className="h-full bg-gradient-to-r from-purple-500 to-indigo-500 transition-all duration-500"
style={{ width: '100%' }}
/>
) : (
<>
<div
className="h-full bg-gradient-to-r from-emerald-500 to-emerald-400 transition-all duration-500"
style={{ width: `${(onlineCount / totalCount) * 100}%` }}
/>
<div
className="h-full bg-gradient-to-r from-red-500 to-red-400 transition-all duration-500"
style={{ width: `${(offlineCount / totalCount) * 100}%` }}
/>
</>
)}
</div>
{/* Service grid by category */}
{categories.map(category => (
<div key={category}>
<div className="flex items-center gap-2 mb-4">
<span className="text-gray-400">{CATEGORY_ICONS[category]}</span>
<h2 className="text-lg font-semibold text-gray-300">{category}</h2>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{SERVICES.filter(s => s.category === category).map(service => {
const st = statuses[service.name] || { status: 'unknown' };
return (
<div
key={service.name}
className={`relative rounded-xl border p-4 transition-all duration-200 ${
st.status === 'online'
? 'bg-gray-800/50 border-gray-700 hover:border-emerald-500/50'
: st.status === 'offline'
? 'bg-gray-800/50 border-red-500/30 hover:border-red-500/50'
: 'bg-gray-800/50 border-gray-700'
}`}
>
<div className="flex items-start justify-between">
<div className="flex-1">
<div className="flex items-center gap-2">
<StatusDot status={st.status} />
<h3 className="font-medium text-gray-200">{service.name}</h3>
</div>
<p className="text-xs text-gray-500 mt-1">{service.description}</p>
{st.responseTime !== null && (
<p className="text-xs text-gray-600 mt-0.5">{st.responseTime}ms</p>
)}
</div>
<div className="flex items-center gap-2">
{service.restart && st.status === 'offline' && (
<button
onClick={() => restartService(service)}
disabled={restarting[service.name]}
className="text-xs px-2.5 py-1 rounded-md bg-amber-600/80 hover:bg-amber-500 disabled:bg-gray-700 disabled:text-gray-500 text-white transition-colors flex items-center gap-1"
>
{restarting[service.name] ? (
<>
<svg className="w-3 h-3 animate-spin" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
Restarting
</>
) : (
<>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M5.636 18.364a9 9 0 010-12.728m12.728 0a9 9 0 010 12.728M12 9v3m0 0v3m0-3h3m-3 0H9" />
</svg>
Restart
</>
)}
</button>
)}
{service.uiUrl && (
<a
href={service.uiUrl}
target="_blank"
rel="noopener noreferrer"
className="text-xs px-2.5 py-1 rounded-md bg-gray-700 hover:bg-gray-600 text-gray-300 transition-colors flex items-center gap-1"
>
Open
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M13.5 6H5.25A2.25 2.25 0 003 8.25v10.5A2.25 2.25 0 005.25 21h10.5A2.25 2.25 0 0018 18.75V10.5m-10.5 6L21 3m0 0h-5.25M21 3v5.25" />
</svg>
</a>
)}
</div>
</div>
</div>
);
})}
</div>
</div>
))}
</div>
);
}

View File

@@ -0,0 +1,915 @@
import React, { useState, useEffect, useRef } from 'react';
import { validateCharacter, migrateV1toV2 } from '../lib/SchemaValidator';
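// Blank schema-v2 character used when the editor is opened without an existing profile.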
const DEFAULT_CHARACTER = {
schema_version: 2,
name: "",
display_name: "",
description: "",
background: "",
dialogue_style: "",
appearance: "",
skills: [],
system_prompt: "",
model_overrides: {
primary: "qwen3.5:35b-a3b",
fast: "qwen2.5:7b"
},
tts: {
engine: "kokoro",
kokoro_voice: "af_heart",
speed: 1.0
},
gaze_presets: [],
custom_rules: [],
notes: ""
};
export default function Editor() {
const [character, setCharacter] = useState(() => {
const editData = sessionStorage.getItem('edit_character');
if (editData) {
sessionStorage.removeItem('edit_character');
try {
const parsed = JSON.parse(editData);
// Auto-migrate v1 data
if (parsed.schema_version === 1 || !parsed.schema_version) {
migrateV1toV2(parsed);
}
return parsed;
} catch {
return DEFAULT_CHARACTER;
}
}
return DEFAULT_CHARACTER;
});
const [error, setError] = useState(null);
const [saved, setSaved] = useState(false);
const isEditing = !!sessionStorage.getItem('edit_character_profile_id');
// TTS preview state: 'idle' | 'loading' | 'playing'
const [ttsState, setTtsState] = useState('idle');
const [previewText, setPreviewText] = useState('');
const audioRef = useRef(null);
const objectUrlRef = useRef(null);
// ElevenLabs state
const [elevenLabsApiKey, setElevenLabsApiKey] = useState(localStorage.getItem('elevenlabs_api_key') || '');
const [elevenLabsVoices, setElevenLabsVoices] = useState([]);
const [elevenLabsModels, setElevenLabsModels] = useState([]);
const [isLoadingElevenLabs, setIsLoadingElevenLabs] = useState(false);
// GAZE presets state (from API)
const [availableGazePresets, setAvailableGazePresets] = useState([]);
const [isLoadingGaze, setIsLoadingGaze] = useState(false);
// Character lookup state
const [lookupName, setLookupName] = useState('');
const [lookupFranchise, setLookupFranchise] = useState('');
const [isLookingUp, setIsLookingUp] = useState(false);
const [lookupDone, setLookupDone] = useState(false);
// Skills input state
const [newSkill, setNewSkill] = useState('');
const fetchElevenLabsData = async (key) => {
if (!key) return;
setIsLoadingElevenLabs(true);
try {
const headers = { 'xi-api-key': key };
const [voicesRes, modelsRes] = await Promise.all([
fetch('https://api.elevenlabs.io/v1/voices', { headers }),
fetch('https://api.elevenlabs.io/v1/models', { headers })
]);
if (!voicesRes.ok || !modelsRes.ok) {
throw new Error('Failed to fetch from ElevenLabs API (check API key)');
}
const voicesData = await voicesRes.json();
const modelsData = await modelsRes.json();
setElevenLabsVoices(voicesData.voices || []);
setElevenLabsModels(Array.isArray(modelsData) ? modelsData.filter(m => m.can_do_text_to_speech) : []); // /v1/models returns a bare array; guard against error payloads
localStorage.setItem('elevenlabs_api_key', key);
} catch (err) {
setError(err.message);
} finally {
setIsLoadingElevenLabs(false);
}
};
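// Re-fetch voice/model lists when the engine is switched to ElevenLabs and a key is already stored.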
useEffect(() => {
if (elevenLabsApiKey && character.tts.engine === 'elevenlabs') {
fetchElevenLabsData(elevenLabsApiKey);
}
}, [character.tts.engine]);
// Fetch GAZE presets on mount
useEffect(() => {
setIsLoadingGaze(true);
fetch('/api/gaze/presets')
.then(r => r.ok ? r.json() : { presets: [] })
.then(data => setAvailableGazePresets(data.presets || []))
.catch(() => {})
.finally(() => setIsLoadingGaze(false));
}, []);
useEffect(() => {
return () => {
if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
if (objectUrlRef.current) { URL.revokeObjectURL(objectUrlRef.current); }
window.speechSynthesis.cancel();
};
}, []);
const handleExport = () => {
try {
validateCharacter(character);
setError(null);
const dataStr = "data:text/json;charset=utf-8," + encodeURIComponent(JSON.stringify(character, null, 2));
const a = document.createElement('a');
a.href = dataStr;
a.download = `${character.name || 'character'}.json`;
document.body.appendChild(a);
a.click();
a.remove();
} catch (err) {
setError(err.message);
}
};
const handleSaveToProfiles = async () => {
try {
validateCharacter(character);
setError(null);
const profileId = sessionStorage.getItem('edit_character_profile_id');
let profile;
if (profileId) {
const res = await fetch('/api/characters');
const profiles = await res.json();
const existing = profiles.find(p => p.id === profileId);
profile = existing
? { ...existing, data: character }
: { id: profileId, data: character, image: null, addedAt: new Date().toISOString() };
// Keep the profile ID in sessionStorage so subsequent saves update the same file
} else {
const id = character.name + '_' + Date.now();
profile = { id, data: character, image: null, addedAt: new Date().toISOString() };
// Store the new ID so subsequent saves update the same file
sessionStorage.setItem('edit_character_profile_id', profile.id);
}
await fetch('/api/characters', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(profile),
});
setSaved(true);
setTimeout(() => setSaved(false), 2000);
} catch (err) {
setError(err.message);
}
};
const handleImport = (e) => {
const file = e.target.files[0];
if (!file) return;
const reader = new FileReader();
reader.onload = (ev) => {
try {
const importedChar = JSON.parse(ev.target.result);
validateCharacter(importedChar);
setCharacter(importedChar);
setError(null);
} catch (err) {
setError(`Import failed: ${err.message}`);
}
};
reader.readAsText(file);
};
// Character lookup from MCP
const handleCharacterLookup = async () => {
if (!lookupName || !lookupFranchise) return;
setIsLookingUp(true);
setError(null);
try {
const res = await fetch('/api/character-lookup', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: lookupName, franchise: lookupFranchise }),
});
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Lookup failed' }));
throw new Error(err.error || `Lookup returned ${res.status}`);
}
const data = await res.json();
// Build dialogue_style from personality + notable quotes
let dialogueStyle = data.personality || '';
if (data.notable_quotes?.length) {
dialogueStyle += '\n\nExample dialogue:\n' + data.notable_quotes.map(q => `"${q}"`).join('\n');
}
// Filter abilities to clean text-only entries (skip image captions)
const skills = (data.abilities || [])
.filter(a => a.length > 20 && !a.includes('.jpg') && !a.includes('.png'))
.slice(0, 10);
// Auto-generate system prompt
const promptName = character.display_name || lookupName;
const personality = data.personality ? data.personality.split('.').slice(0, 3).join('.') + '.' : '';
const systemPrompt = `You are ${promptName} from ${lookupFranchise}. ${personality} Stay in character at all times. Respond naturally and conversationally.`;
setCharacter(prev => ({
...prev,
name: prev.name || lookupName.toLowerCase().replace(/\s+/g, '_'),
display_name: prev.display_name || lookupName,
description: data.description ? data.description.split('.').slice(0, 2).join('.') + '.' : prev.description,
background: data.background || prev.background,
appearance: data.appearance || prev.appearance,
dialogue_style: dialogueStyle || prev.dialogue_style,
skills: skills.length > 0 ? skills : prev.skills,
system_prompt: prev.system_prompt || systemPrompt,
}));
setLookupDone(true);
} catch (err) {
setError(`Character lookup failed: ${err.message}`);
} finally {
setIsLookingUp(false);
}
};
const handleChange = (field, value) => {
setCharacter(prev => ({ ...prev, [field]: value }));
};
const handleNestedChange = (parent, field, value) => {
setCharacter(prev => ({
...prev,
[parent]: { ...prev[parent], [field]: value }
}));
};
// Skills helpers
const addSkill = () => {
const trimmed = newSkill.trim();
if (!trimmed) return;
setCharacter(prev => ({
...prev,
skills: [...(prev.skills || []), trimmed]
}));
setNewSkill('');
};
const removeSkill = (index) => {
setCharacter(prev => {
const updated = [...(prev.skills || [])];
updated.splice(index, 1);
return { ...prev, skills: updated };
});
};
// GAZE preset helpers
const addGazePreset = () => {
setCharacter(prev => ({
...prev,
gaze_presets: [...(prev.gaze_presets || []), { preset: '', trigger: 'self-portrait' }]
}));
};
const removeGazePreset = (index) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated.splice(index, 1);
return { ...prev, gaze_presets: updated };
});
};
const handleGazePresetChange = (index, field, value) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated[index] = { ...updated[index], [field]: value };
return { ...prev, gaze_presets: updated };
});
};
// Custom rules helpers
const handleRuleChange = (index, field, value) => {
setCharacter(prev => {
const newRules = [...(prev.custom_rules || [])];
newRules[index] = { ...newRules[index], [field]: value };
return { ...prev, custom_rules: newRules };
});
};
const addRule = () => {
setCharacter(prev => ({
...prev,
custom_rules: [...(prev.custom_rules || []), { trigger: "", response: "", condition: "" }]
}));
};
const removeRule = (index) => {
setCharacter(prev => {
const newRules = [...(prev.custom_rules || [])];
newRules.splice(index, 1);
return { ...prev, custom_rules: newRules };
});
};
// TTS preview
const stopPreview = () => {
if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
if (objectUrlRef.current) { URL.revokeObjectURL(objectUrlRef.current); objectUrlRef.current = null; }
window.speechSynthesis.cancel();
setTtsState('idle');
};
const previewTTS = async () => {
stopPreview();
const text = previewText || `Hi, I am ${character.display_name || character.name}. This is a preview of my voice.`;
const engine = character.tts.engine;
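// Kokoro and ElevenLabs are synthesized via the local bridge; other engines fall back to browser speechSynthesis.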
let bridgeBody = null;
if (engine === 'kokoro') {
bridgeBody = { text, voice: character.tts.kokoro_voice, engine: 'kokoro' };
} else if (engine === 'elevenlabs' && character.tts.elevenlabs_voice_id) {
bridgeBody = { text, voice: character.tts.elevenlabs_voice_id, engine: 'elevenlabs', model: character.tts.elevenlabs_model };
}
if (bridgeBody) {
setTtsState('loading');
let blob;
try {
const response = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(bridgeBody)
});
if (!response.ok) throw new Error('TTS bridge returned ' + response.status);
blob = await response.blob();
} catch (err) {
setTtsState('idle');
setError(`${engine} preview failed: ${err.message}. Falling back to browser TTS.`);
runBrowserTTS(text);
return;
}
const url = URL.createObjectURL(blob);
objectUrlRef.current = url;
const audio = new Audio(url);
audio.playbackRate = character.tts.speed;
audio.onended = () => { stopPreview(); };
audio.onerror = () => { stopPreview(); };
audioRef.current = audio;
setTtsState('playing');
audio.play().catch(() => {});
} else {
runBrowserTTS(text);
}
};
const runBrowserTTS = (text) => {
const utterance = new SpeechSynthesisUtterance(text);
utterance.rate = character.tts.speed;
const voices = window.speechSynthesis.getVoices();
const preferredVoice = voices.find(v => v.lang.startsWith('en') && v.name.includes('Female')) || voices.find(v => v.lang.startsWith('en'));
if (preferredVoice) utterance.voice = preferredVoice;
setTtsState('playing');
utterance.onend = () => setTtsState('idle');
window.speechSynthesis.cancel();
window.speechSynthesis.speak(utterance);
};
const inputClass = "w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none transition-colors";
const selectClass = "w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none transition-colors";
const labelClass = "block text-sm font-medium text-gray-400 mb-1";
const cardClass = "bg-gray-900 border border-gray-800 p-5 rounded-xl space-y-4";
return (
<div className="space-y-6">
<div className="flex justify-between items-center">
<div>
<h1 className="text-3xl font-bold text-gray-100">Character Editor</h1>
<p className="text-sm text-gray-500 mt-1">
{character.display_name || character.name
? `Editing: ${character.display_name || character.name}`
: 'New character'}
</p>
</div>
<div className="flex gap-3">
<label className="cursor-pointer flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg border border-gray-700 transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5m-13.5-9L12 3m0 0l4.5 4.5M12 3v13.5" />
</svg>
Import
<input type="file" accept=".json" className="hidden" onChange={handleImport} />
</label>
<button
onClick={handleSaveToProfiles}
className={`flex items-center gap-2 px-4 py-2 rounded-lg transition-colors ${
saved
? 'bg-emerald-600 text-white'
: 'bg-indigo-600 hover:bg-indigo-500 text-white'
}`}
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
{saved
? <path strokeLinecap="round" strokeLinejoin="round" d="M4.5 12.75l6 6 9-13.5" />
: <path strokeLinecap="round" strokeLinejoin="round" d="M17.593 3.322c1.1.128 1.907 1.077 1.907 2.185V21L12 17.25 4.5 21V5.507c0-1.108.806-2.057 1.907-2.185a48.507 48.507 0 0111.186 0z" />
}
</svg>
{saved ? 'Saved' : 'Save to Profiles'}
</button>
<button
onClick={handleExport}
className="flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg border border-gray-700 transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5M16.5 12L12 16.5m0 0L7.5 12m4.5 4.5V3" />
</svg>
Export JSON
</button>
</div>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Character Lookup — auto-fill from fictional character wiki */}
{!isEditing && (
<div className={cardClass}>
<div className="flex items-center gap-2">
<svg className="w-5 h-5 text-indigo-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M21 21l-5.197-5.197m0 0A7.5 7.5 0 105.196 5.196a7.5 7.5 0 0010.607 10.607z" />
</svg>
<h2 className="text-lg font-semibold text-gray-200">Auto-fill from Character</h2>
</div>
<p className="text-xs text-gray-500">Fetch character data from Fandom/Wikipedia to auto-populate fields. You can edit everything after.</p>
<div className="flex gap-3 items-end">
<div className="flex-1">
<label className={labelClass}>Character Name</label>
<input
type="text"
className={inputClass}
value={lookupName}
onChange={(e) => setLookupName(e.target.value)}
placeholder="e.g. Tifa Lockhart"
/>
</div>
<div className="flex-1">
<label className={labelClass}>Franchise / Series</label>
<input
type="text"
className={inputClass}
value={lookupFranchise}
onChange={(e) => setLookupFranchise(e.target.value)}
placeholder="e.g. Final Fantasy VII"
/>
</div>
<button
onClick={handleCharacterLookup}
disabled={isLookingUp || !lookupName || !lookupFranchise}
className={`flex items-center gap-2 px-5 py-2 rounded-lg text-white transition-colors whitespace-nowrap ${
isLookingUp
? 'bg-indigo-800 cursor-wait'
: lookupDone
? 'bg-emerald-600 hover:bg-emerald-500'
: 'bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500'
}`}
>
{isLookingUp && (
<svg className="w-4 h-4 animate-spin" viewBox="0 0 24 24" fill="none">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
)}
{isLookingUp ? 'Fetching...' : lookupDone ? 'Fetched' : 'Lookup'}
</button>
</div>
{lookupDone && (
<p className="text-xs text-emerald-400">Fields populated from wiki data. Review and edit below.</p>
)}
</div>
)}
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
{/* Basic Info */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Basic Info</h2>
<div>
<label className={labelClass}>Name (ID)</label>
<input type="text" className={inputClass} value={character.name} onChange={(e) => handleChange('name', e.target.value)} />
</div>
<div>
<label className={labelClass}>Display Name</label>
<input type="text" className={inputClass} value={character.display_name || ''} onChange={(e) => handleChange('display_name', e.target.value)} />
</div>
<div>
<label className={labelClass}>Description</label>
<input type="text" className={inputClass} value={character.description || ''} onChange={(e) => handleChange('description', e.target.value)} />
</div>
</div>
{/* TTS Configuration */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">TTS Configuration</h2>
<div>
<label className={labelClass}>Engine</label>
<select className={selectClass} value={character.tts.engine} onChange={(e) => handleNestedChange('tts', 'engine', e.target.value)}>
<option value="kokoro">Kokoro</option>
<option value="chatterbox">Chatterbox</option>
<option value="qwen3">Qwen3</option>
<option value="elevenlabs">ElevenLabs</option>
</select>
</div>
{character.tts.engine === 'elevenlabs' && (
<div className="space-y-4 border border-gray-700 p-4 rounded-lg bg-gray-800/50">
<div>
<label className="block text-xs font-medium mb-1 text-gray-500">ElevenLabs API Key (Local Use Only)</label>
<div className="flex gap-2">
<input type="password" placeholder="sk_..." className={inputClass + " text-sm"} value={elevenLabsApiKey} onChange={(e) => setElevenLabsApiKey(e.target.value)} />
<button onClick={() => fetchElevenLabsData(elevenLabsApiKey)} disabled={isLoadingElevenLabs} className="bg-indigo-600 text-white px-3 py-1 rounded-lg text-sm whitespace-nowrap hover:bg-indigo-500 disabled:opacity-50 transition-colors">
{isLoadingElevenLabs ? 'Loading...' : 'Fetch'}
</button>
</div>
</div>
<div>
<label className={labelClass}>Voice ID</label>
{elevenLabsVoices.length > 0 ? (
<select className={selectClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => {
const voiceId = e.target.value;
const voice = elevenLabsVoices.find(v => v.voice_id === voiceId);
setCharacter(prev => ({
...prev,
tts: { ...prev.tts, elevenlabs_voice_id: voiceId, elevenlabs_voice_name: voice?.name || '' }
}));
}}>
<option value="">-- Select Voice --</option>
{elevenLabsVoices.map(v => (
<option key={v.voice_id} value={v.voice_id}>{v.name} ({v.category})</option>
))}
</select>
) : (
<input type="text" className={inputClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => handleNestedChange('tts', 'elevenlabs_voice_id', e.target.value)} placeholder="e.g. 21m00Tcm4TlvDq8ikWAM" />
)}
</div>
<div>
<label className={labelClass}>Model</label>
{elevenLabsModels.length > 0 ? (
<select className={selectClass} value={character.tts.elevenlabs_model || 'eleven_monolingual_v1'} onChange={(e) => handleNestedChange('tts', 'elevenlabs_model', e.target.value)}>
<option value="">-- Select Model --</option>
{elevenLabsModels.map(m => (
<option key={m.model_id} value={m.model_id}>{m.name} ({m.model_id})</option>
))}
</select>
) : (
<input type="text" className={inputClass} value={character.tts.elevenlabs_model || 'eleven_monolingual_v1'} onChange={(e) => handleNestedChange('tts', 'elevenlabs_model', e.target.value)} placeholder="e.g. eleven_monolingual_v1" />
)}
</div>
</div>
)}
{character.tts.engine === 'kokoro' && (
<div>
<label className={labelClass}>Kokoro Voice</label>
<select className={selectClass} value={character.tts.kokoro_voice || 'af_heart'} onChange={(e) => handleNestedChange('tts', 'kokoro_voice', e.target.value)}>
<option value="af_heart">af_heart (American Female)</option>
<option value="af_alloy">af_alloy (American Female)</option>
<option value="af_aoede">af_aoede (American Female)</option>
<option value="af_bella">af_bella (American Female)</option>
<option value="af_jessica">af_jessica (American Female)</option>
<option value="af_kore">af_kore (American Female)</option>
<option value="af_nicole">af_nicole (American Female)</option>
<option value="af_nova">af_nova (American Female)</option>
<option value="af_river">af_river (American Female)</option>
<option value="af_sarah">af_sarah (American Female)</option>
<option value="af_sky">af_sky (American Female)</option>
<option value="am_adam">am_adam (American Male)</option>
<option value="am_echo">am_echo (American Male)</option>
<option value="am_eric">am_eric (American Male)</option>
<option value="am_fenrir">am_fenrir (American Male)</option>
<option value="am_liam">am_liam (American Male)</option>
<option value="am_michael">am_michael (American Male)</option>
<option value="am_onyx">am_onyx (American Male)</option>
<option value="am_puck">am_puck (American Male)</option>
<option value="am_santa">am_santa (American Male)</option>
<option value="bf_alice">bf_alice (British Female)</option>
<option value="bf_emma">bf_emma (British Female)</option>
<option value="bf_isabella">bf_isabella (British Female)</option>
<option value="bf_lily">bf_lily (British Female)</option>
<option value="bm_daniel">bm_daniel (British Male)</option>
<option value="bm_fable">bm_fable (British Male)</option>
<option value="bm_george">bm_george (British Male)</option>
<option value="bm_lewis">bm_lewis (British Male)</option>
</select>
</div>
)}
{character.tts.engine === 'chatterbox' && (
<div>
<label className={labelClass}>Voice Reference Path</label>
<input type="text" className={inputClass} value={character.tts.voice_ref_path || ''} onChange={(e) => handleNestedChange('tts', 'voice_ref_path', e.target.value)} />
</div>
)}
<div>
<label className={labelClass}>Speed: {character.tts.speed}</label>
<input type="range" min="0.5" max="2.0" step="0.1" className="w-full accent-indigo-500" value={character.tts.speed} onChange={(e) => handleNestedChange('tts', 'speed', parseFloat(e.target.value))} />
</div>
<div>
<label className={labelClass}>Preview Text</label>
<input
type="text"
className={inputClass}
value={previewText}
onChange={(e) => setPreviewText(e.target.value)}
placeholder={`Hi, I am ${character.display_name || character.name || 'your character'}. This is a preview of my voice.`}
/>
</div>
<div className="flex gap-2">
<button
onClick={previewTTS}
disabled={ttsState === 'loading'}
className={`flex-1 flex items-center justify-center gap-2 px-4 py-2 rounded-lg transition-colors ${
ttsState === 'loading'
? 'bg-indigo-800 text-indigo-300 cursor-wait'
: ttsState === 'playing'
? 'bg-emerald-600 hover:bg-emerald-500 text-white'
: 'bg-indigo-600 hover:bg-indigo-500 text-white'
}`}
>
{ttsState === 'loading' && (
<svg className="w-4 h-4 animate-spin" viewBox="0 0 24 24" fill="none">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
)}
{ttsState === 'loading' ? 'Synthesizing...' : ttsState === 'playing' ? 'Playing...' : 'Preview Voice'}
</button>
{ttsState !== 'idle' && (
<button
onClick={stopPreview}
className="px-4 py-2 bg-red-600 hover:bg-red-500 text-white rounded-lg transition-colors"
>
Stop
</button>
)}
</div>
<p className="text-xs text-gray-600">
{character.tts.engine === 'kokoro'
? 'Previews via local Kokoro TTS bridge (port 8081).'
: character.tts.engine === 'elevenlabs'
? 'Previews via ElevenLabs through bridge.'
: 'Uses browser TTS for preview. Local TTS available with Kokoro engine.'}
</p>
</div>
</div>
{/* System Prompt */}
<div className={cardClass}>
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">System Prompt</h2>
<span className="text-xs text-gray-600">{(character.system_prompt || '').length} chars</span>
</div>
<textarea
className={inputClass + " h-32 resize-y"}
value={character.system_prompt || ''}
onChange={(e) => handleChange('system_prompt', e.target.value)}
placeholder="You are [character name]. Describe their personality, behaviour, and role..."
/>
</div>
{/* Character Profile — new v2 fields */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Character Profile</h2>
<div>
<label className={labelClass}>Background / Backstory</label>
<textarea
className={inputClass + " h-28 resize-y text-sm"}
value={character.background || ''}
onChange={(e) => handleChange('background', e.target.value)}
placeholder="Character history, origins, key life events..."
/>
</div>
<div>
<label className={labelClass}>Appearance</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.appearance || ''}
onChange={(e) => handleChange('appearance', e.target.value)}
placeholder="Physical description — also used for image generation prompts..."
/>
</div>
<div>
<label className={labelClass}>Dialogue Style & Examples</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.dialogue_style || ''}
onChange={(e) => handleChange('dialogue_style', e.target.value)}
placeholder="How the persona speaks, their tone, mannerisms, and example lines..."
/>
</div>
<div>
<label className={labelClass}>Skills & Interests</label>
<div className="flex flex-wrap gap-2 mb-2">
{(character.skills || []).map((skill, idx) => (
<span
key={idx}
className="inline-flex items-center gap-1 px-3 py-1 bg-indigo-500/20 text-indigo-300 text-sm rounded-full border border-indigo-500/30"
>
{skill.length > 80 ? skill.slice(0, 80) + '...' : skill}
<button
onClick={() => removeSkill(idx)}
className="ml-1 text-indigo-400 hover:text-red-400 transition-colors"
>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={3}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</span>
))}
</div>
<div className="flex gap-2">
<input
type="text"
className={inputClass + " text-sm"}
value={newSkill}
onChange={(e) => setNewSkill(e.target.value)}
onKeyDown={(e) => { if (e.key === 'Enter') { e.preventDefault(); addSkill(); } }}
placeholder="Add a skill or interest..."
/>
<button
onClick={addSkill}
disabled={!newSkill.trim()}
className="px-3 py-2 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors whitespace-nowrap"
>
Add
</button>
</div>
</div>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
{/* Image Generation — GAZE presets */}
<div className={cardClass}>
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">GAZE Presets</h2>
<button onClick={addGazePreset} className="flex items-center gap-1 bg-indigo-600 hover:bg-indigo-500 text-white px-3 py-1.5 rounded-lg text-sm transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Preset
</button>
</div>
<p className="text-xs text-gray-500">Image generation presets with trigger conditions. Default trigger is "self-portrait".</p>
{(!character.gaze_presets || character.gaze_presets.length === 0) ? (
<p className="text-sm text-gray-600 italic">No GAZE presets configured.</p>
) : (
<div className="space-y-3">
{character.gaze_presets.map((gp, idx) => (
<div key={idx} className="flex items-center gap-2 border border-gray-700 p-3 rounded-lg bg-gray-800/50">
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Preset</label>
{isLoadingGaze ? (
<p className="text-sm text-gray-500">Loading...</p>
) : availableGazePresets.length > 0 ? (
<select
className={selectClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
>
<option value="">-- Select --</option>
{availableGazePresets.map(p => (
<option key={p.slug} value={p.slug}>{p.name} ({p.slug})</option>
))}
</select>
) : (
<input
type="text"
className={inputClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
placeholder="Preset slug"
/>
)}
</div>
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Trigger</label>
<input
type="text"
className={inputClass + " text-sm"}
value={gp.trigger || ''}
onChange={(e) => handleGazePresetChange(idx, 'trigger', e.target.value)}
placeholder="e.g. self-portrait, battle scene"
/>
</div>
<button
onClick={() => removeGazePreset(idx)}
className="mt-5 px-2 py-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
))}
</div>
)}
</div>
{/* Model Overrides */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Model Overrides</h2>
<div>
<label className={labelClass}>Primary Model</label>
<select className={selectClass} value={character.model_overrides?.primary || 'qwen3.5:35b-a3b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
<option value="llama3.3:70b">llama3.3:70b</option>
<option value="qwen3.5:35b-a3b">qwen3.5:35b-a3b</option>
<option value="qwen2.5:7b">qwen2.5:7b</option>
<option value="qwen3:32b">qwen3:32b</option>
<option value="codestral:22b">codestral:22b</option>
</select>
</div>
<div>
<label className={labelClass}>Fast Model</label>
<select className={selectClass} value={character.model_overrides?.fast || 'qwen2.5:7b'} onChange={(e) => handleNestedChange('model_overrides', 'fast', e.target.value)}>
<option value="qwen2.5:7b">qwen2.5:7b</option>
<option value="qwen3.5:35b-a3b">qwen3.5:35b-a3b</option>
<option value="llama3.3:70b">llama3.3:70b</option>
<option value="qwen3:32b">qwen3:32b</option>
<option value="codestral:22b">codestral:22b</option>
</select>
</div>
</div>
</div>
{/* Custom Rules */}
<div className={cardClass}>
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">Custom Rules</h2>
<button onClick={addRule} className="flex items-center gap-1 bg-indigo-600 hover:bg-indigo-500 text-white px-3 py-1.5 rounded-lg text-sm transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Rule
</button>
</div>
{(!character.custom_rules || character.custom_rules.length === 0) ? (
<p className="text-sm text-gray-600 italic">No custom rules defined.</p>
) : (
<div className="space-y-4">
{character.custom_rules.map((rule, idx) => (
<div key={idx} className="border border-gray-700 p-4 rounded-lg relative bg-gray-800/50">
<button
onClick={() => removeRule(idx)}
className="absolute top-3 right-3 text-gray-500 hover:text-red-400 transition-colors"
title="Remove Rule"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4 mt-1">
<div>
<label className="block text-xs font-medium mb-1 text-gray-500">Trigger</label>
<input type="text" className={inputClass + " text-sm"} value={rule.trigger || ''} onChange={(e) => handleRuleChange(idx, 'trigger', e.target.value)} />
</div>
<div>
<label className="block text-xs font-medium mb-1 text-gray-500">Condition (Optional)</label>
<input type="text" className={inputClass + " text-sm"} value={rule.condition || ''} onChange={(e) => handleRuleChange(idx, 'condition', e.target.value)} placeholder="e.g. time_of_day == morning" />
</div>
<div className="md:col-span-2">
<label className="block text-xs font-medium mb-1 text-gray-500">Response</label>
<textarea className={inputClass + " text-sm h-16 resize-y"} value={rule.response || ''} onChange={(e) => handleRuleChange(idx, 'response', e.target.value)} />
</div>
</div>
</div>
))}
</div>
)}
</div>
{/* Notes */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Notes</h2>
<textarea
className={inputClass + " h-20 resize-y text-sm"}
value={character.notes || ''}
onChange={(e) => handleChange('notes', e.target.value)}
placeholder="Internal notes, reminders, or references..."
/>
</div>
</div>
);
}

View File

@@ -0,0 +1,346 @@
import { useState, useEffect, useCallback } from 'react';
import {
getPersonalMemories, savePersonalMemory, deletePersonalMemory,
getGeneralMemories, saveGeneralMemory, deleteGeneralMemory,
} from '../lib/memoryApi';
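// Two memory scopes: personal (per character) and general (shared across all characters).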
const PERSONAL_CATEGORIES = [
{ value: 'personal_info', label: 'Personal Info', color: 'bg-blue-500/20 text-blue-300 border-blue-500/30' },
{ value: 'preference', label: 'Preference', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'interaction', label: 'Interaction', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'emotional', label: 'Emotional', color: 'bg-pink-500/20 text-pink-300 border-pink-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
const GENERAL_CATEGORIES = [
{ value: 'system', label: 'System', color: 'bg-indigo-500/20 text-indigo-300 border-indigo-500/30' },
{ value: 'tool_usage', label: 'Tool Usage', color: 'bg-cyan-500/20 text-cyan-300 border-cyan-500/30' },
{ value: 'home_layout', label: 'Home Layout', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'device', label: 'Device', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'routine', label: 'Routine', color: 'bg-purple-500/20 text-purple-300 border-purple-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
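// localStorage key holding the active character id, shared with the rest of the dashboard.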
const ACTIVE_KEY = 'homeai_active_character';
function CategoryBadge({ category, categories }) {
const cat = categories.find(c => c.value === category) || categories[categories.length - 1];
return (
<span className={`px-2 py-0.5 text-xs rounded-full border ${cat.color}`}>
{cat.label}
</span>
);
}
function MemoryCard({ memory, categories, onEdit, onDelete }) {
return (
<div className="border border-gray-700 rounded-lg p-4 bg-gray-800/50 space-y-2">
<div className="flex items-start justify-between gap-3">
<p className="text-sm text-gray-200 flex-1 whitespace-pre-wrap">{memory.content}</p>
<div className="flex gap-1 shrink-0">
<button
onClick={() => onEdit(memory)}
className="p-1.5 text-gray-500 hover:text-gray-300 transition-colors"
title="Edit"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M16.862 4.487l1.687-1.688a1.875 1.875 0 112.652 2.652L10.582 16.07a4.5 4.5 0 01-1.897 1.13L6 18l.8-2.685a4.5 4.5 0 011.13-1.897l8.932-8.931z" />
</svg>
</button>
<button
onClick={() => onDelete(memory.id)}
className="p-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Delete"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
</div>
<div className="flex items-center gap-2">
<CategoryBadge category={memory.category} categories={categories} />
<span className="text-xs text-gray-600">
{memory.createdAt ? new Date(memory.createdAt).toLocaleDateString() : ''}
</span>
</div>
</div>
);
}
function MemoryForm({ categories, editing, onSave, onCancel }) {
const [content, setContent] = useState(editing?.content || '');
const [category, setCategory] = useState(editing?.category || categories[0].value);
// Re-sync fields when a different memory is picked for editing while the form is already open
useEffect(() => {
setContent(editing?.content || '');
setCategory(editing?.category || categories[0].value);
}, [editing, categories]);
const handleSubmit = () => {
if (!content.trim()) return;
const memory = {
...(editing?.id ? { id: editing.id } : {}),
content: content.trim(),
category,
};
onSave(memory);
setContent('');
setCategory(categories[0].value);
};
return (
<div className="border border-indigo-500/30 rounded-lg p-4 bg-indigo-500/5 space-y-3">
<textarea
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm h-20 resize-y focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={content}
onChange={(e) => setContent(e.target.value)}
placeholder="Enter memory content..."
autoFocus
/>
<div className="flex items-center gap-3">
<select
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
value={category}
onChange={(e) => setCategory(e.target.value)}
>
{categories.map(c => (
<option key={c.value} value={c.value}>{c.label}</option>
))}
</select>
<div className="flex gap-2 ml-auto">
<button
onClick={onCancel}
className="px-3 py-1.5 bg-gray-700 hover:bg-gray-600 text-gray-300 text-sm rounded-lg transition-colors"
>
Cancel
</button>
<button
onClick={handleSubmit}
disabled={!content.trim()}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
{editing?.id ? 'Update' : 'Add Memory'}
</button>
</div>
</div>
</div>
);
}
export default function Memories() {
const [tab, setTab] = useState('personal'); // 'personal' | 'general'
const [characters, setCharacters] = useState([]);
const [selectedCharId, setSelectedCharId] = useState('');
const [memories, setMemories] = useState([]);
const [loading, setLoading] = useState(false);
const [showForm, setShowForm] = useState(false);
const [editing, setEditing] = useState(null);
const [error, setError] = useState(null);
const [filter, setFilter] = useState('');
// Load characters list
useEffect(() => {
fetch('/api/characters')
.then(r => r.json())
.then(chars => {
setCharacters(chars);
const activeId = localStorage.getItem(ACTIVE_KEY);
if (activeId && chars.some(c => c.id === activeId)) {
setSelectedCharId(activeId);
} else if (chars.length > 0) {
setSelectedCharId(chars[0].id);
}
})
.catch(() => {});
}, []);
// Load memories when tab or selected character changes
const loadMemories = useCallback(async () => {
setLoading(true);
setError(null);
try {
if (tab === 'personal' && selectedCharId) {
const data = await getPersonalMemories(selectedCharId);
setMemories(data.memories || []);
} else if (tab === 'general') {
const data = await getGeneralMemories();
setMemories(data.memories || []);
} else {
setMemories([]);
}
} catch (err) {
setError(err.message);
} finally {
setLoading(false);
}
}, [tab, selectedCharId]);
useEffect(() => { loadMemories(); }, [loadMemories]);
const handleSave = async (memory) => {
try {
if (tab === 'personal') {
await savePersonalMemory(selectedCharId, memory);
} else {
await saveGeneralMemory(memory);
}
setShowForm(false);
setEditing(null);
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleDelete = async (memoryId) => {
try {
if (tab === 'personal') {
await deletePersonalMemory(selectedCharId, memoryId);
} else {
await deleteGeneralMemory(memoryId);
}
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleEdit = (memory) => {
setEditing(memory);
setShowForm(true);
};
const categories = tab === 'personal' ? PERSONAL_CATEGORIES : GENERAL_CATEGORIES;
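// Filter matches either a content substring or an exact category value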
const filteredMemories = filter
? memories.filter(m => m.content?.toLowerCase().includes(filter.toLowerCase()) || m.category === filter)
: memories;
// Sort newest first
const sortedMemories = [...filteredMemories].sort(
(a, b) => (b.createdAt || '').localeCompare(a.createdAt || '')
);
const selectedChar = characters.find(c => c.id === selectedCharId);
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-3xl font-bold text-gray-100">Memories</h1>
<p className="text-sm text-gray-500 mt-1">
{sortedMemories.length} {tab} memor{sortedMemories.length !== 1 ? 'ies' : 'y'}
{tab === 'personal' && selectedChar && (
<span className="ml-1 text-indigo-400">
for {selectedChar.data?.display_name || selectedChar.data?.name || selectedCharId}
</span>
)}
</p>
</div>
<button
onClick={() => { setEditing(null); setShowForm(!showForm); }}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Memory
</button>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Tabs */}
<div className="flex gap-1 bg-gray-900 p-1 rounded-lg border border-gray-800 w-fit">
<button
onClick={() => { setTab('personal'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'personal'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
Personal
</button>
<button
onClick={() => { setTab('general'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'general'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
General
</button>
</div>
{/* Character selector (personal tab only) */}
{tab === 'personal' && (
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400">Character</label>
<select
value={selectedCharId}
onChange={(e) => setSelectedCharId(e.target.value)}
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
>
{characters.map(c => (
<option key={c.id} value={c.id}>
{c.data?.display_name || c.data?.name || c.id}
</option>
))}
</select>
</div>
)}
{/* Search filter */}
<div>
<input
type="text"
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={filter}
onChange={(e) => setFilter(e.target.value)}
placeholder="Search memories..."
/>
</div>
{/* Add/Edit form */}
{showForm && (
<MemoryForm
categories={categories}
editing={editing}
onSave={handleSave}
onCancel={() => { setShowForm(false); setEditing(null); }}
/>
)}
{/* Memory list */}
{loading ? (
<div className="text-center py-12">
<p className="text-gray-500">Loading memories...</p>
</div>
) : sortedMemories.length === 0 ? (
<div className="text-center py-12">
<svg className="w-12 h-12 mx-auto text-gray-700 mb-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
<p className="text-gray-500 text-sm">
{filter ? 'No memories match your search.' : 'No memories yet. Add one to get started.'}
</p>
</div>
) : (
<div className="space-y-3">
{sortedMemories.map(memory => (
<MemoryCard
key={memory.id}
memory={memory}
categories={categories}
onEdit={handleEdit}
onDelete={handleDelete}
/>
))}
</div>
)}
</div>
);
}


@@ -0,0 +1,736 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
const CHARACTERS_DIR = '/Users/aodhan/homeai-data/characters'
const SATELLITE_MAP_PATH = '/Users/aodhan/homeai-data/satellite-map.json'
const CONVERSATIONS_DIR = '/Users/aodhan/homeai-data/conversations'
const MEMORIES_DIR = '/Users/aodhan/homeai-data/memories'
const GAZE_HOST = 'http://10.0.0.101:5782'
const GAZE_API_KEY = process.env.GAZE_API_KEY || ''
function characterStoragePlugin() {
return {
name: 'character-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CHARACTERS_DIR, { recursive: true })
}
// GET /api/characters — list all profiles
server.middlewares.use('/api/characters', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
// req.url has the mount prefix stripped by connect, so "/" means /api/characters
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/characters/:id — single profile
if (req.method === 'GET' && subPath) {
try {
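// Apply the same filename sanitization as POST so reads resolve to the same file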
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CHARACTERS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CHARACTERS_DIR)).filter(f => f.endsWith('.json'))
const profiles = []
for (const file of files) {
try {
const raw = await readFile(`${CHARACTERS_DIR}/${file}`, 'utf-8')
profiles.push(JSON.parse(raw))
} catch { /* skip corrupt files */ }
}
// Sort by addedAt descending
profiles.sort((a, b) => (b.addedAt || '').localeCompare(a.addedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(profiles))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const profile = JSON.parse(Buffer.concat(chunks).toString())
if (!profile.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing profile id' }))
return
}
// Sanitize filename — only allow alphanumeric, underscore, dash, dot
const safeId = profile.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CHARACTERS_DIR}/${safeId}.json`, JSON.stringify(profile, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CHARACTERS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function satelliteMapPlugin() {
return {
name: 'satellite-map',
configureServer(server) {
server.middlewares.use('/api/satellite-map', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readFile, writeFile } = await import('fs/promises')
if (req.method === 'GET') {
try {
const raw = await readFile(SATELLITE_MAP_PATH, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
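// No map file saved yet: fall back to an empty default instead of a 404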
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ default: 'aria_default', satellites: {} }))
}
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const data = JSON.parse(Buffer.concat(chunks).toString())
await writeFile(SATELLITE_MAP_PATH, JSON.stringify(data, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function conversationStoragePlugin() {
return {
name: 'conversation-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CONVERSATIONS_DIR, { recursive: true })
}
server.middlewares.use('/api/conversations', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/conversations/:id — single conversation with messages
if (req.method === 'GET' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CONVERSATIONS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
// GET /api/conversations — list metadata (no messages)
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CONVERSATIONS_DIR)).filter(f => f.endsWith('.json'))
const list = []
for (const file of files) {
try {
const raw = await readFile(`${CONVERSATIONS_DIR}/${file}`, 'utf-8')
const conv = JSON.parse(raw)
list.push({
id: conv.id,
title: conv.title || '',
characterId: conv.characterId || '',
characterName: conv.characterName || '',
createdAt: conv.createdAt || '',
updatedAt: conv.updatedAt || '',
messageCount: (conv.messages || []).length,
})
} catch { /* skip corrupt files */ }
}
list.sort((a, b) => (b.updatedAt || '').localeCompare(a.updatedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(list))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// POST /api/conversations — create or update
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const conv = JSON.parse(Buffer.concat(chunks).toString())
if (!conv.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing conversation id' }))
return
}
const safeId = conv.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CONVERSATIONS_DIR}/${safeId}.json`, JSON.stringify(conv, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// DELETE /api/conversations/:id
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CONVERSATIONS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function healthCheckPlugin() {
return {
name: 'health-check-proxy',
configureServer(server) {
server.middlewares.use('/api/health', async (req, res) => {
const params = new URL(req.url, 'http://localhost').searchParams;
const url = params.get('url');
const mode = params.get('mode'); // 'tcp' for raw TCP port check
if (!url) {
res.writeHead(400, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ error: 'Missing url param' }));
return;
}
const start = Date.now();
const parsedUrl = new URL(url);
try {
if (mode === 'tcp') {
// TCP socket connect check for non-HTTP services (e.g. Wyoming)
const { default: net } = await import('net');
await new Promise((resolve, reject) => {
const socket = net.createConnection(
{ host: parsedUrl.hostname, port: parseInt(parsedUrl.port, 10) || (parsedUrl.protocol === 'https:' ? 443 : 80), timeout: 5000 },
() => { socket.destroy(); resolve(); }
);
socket.on('error', reject);
socket.on('timeout', () => { socket.destroy(); reject(new Error('timeout')); });
});
} else {
// HTTP/HTTPS health check
const { default: https } = await import('https');
const { default: http } = await import('http');
const client = parsedUrl.protocol === 'https:' ? https : http;
await new Promise((resolve, reject) => {
const reqObj = client.get(url, { rejectUnauthorized: false, timeout: 5000 }, (resp) => {
resp.resume();
resolve();
});
reqObj.on('error', reject);
reqObj.on('timeout', () => { reqObj.destroy(); reject(new Error('timeout')); });
});
}
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ status: 'online', responseTime: Date.now() - start }));
} catch {
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ status: 'offline', responseTime: null }));
}
});
// Service restart — runs launchctl or docker restart
server.middlewares.use('/api/service/restart', async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'POST', 'Access-Control-Allow-Headers': 'Content-Type' });
res.end();
return;
}
if (req.method !== 'POST') {
res.writeHead(405);
res.end();
return;
}
try {
const chunks = [];
for await (const chunk of req) chunks.push(chunk);
const { type, id } = JSON.parse(Buffer.concat(chunks).toString());
if (!type || !id) {
res.writeHead(400, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ ok: false, error: 'Missing type or id' }));
return;
}
// Whitelist valid service IDs to prevent command injection
const ALLOWED_LAUNCHD = [
'gui/501/com.homeai.ollama',
'gui/501/com.homeai.openclaw',
'gui/501/com.homeai.openclaw-bridge',
'gui/501/com.homeai.wyoming-stt',
'gui/501/com.homeai.wyoming-tts',
'gui/501/com.homeai.wyoming-satellite',
'gui/501/com.homeai.dashboard',
];
const ALLOWED_DOCKER = [
'homeai-open-webui',
'homeai-uptime-kuma',
'homeai-n8n',
'homeai-code-server',
];
let cmd;
if (type === 'launchd' && ALLOWED_LAUNCHD.includes(id)) {
cmd = ['launchctl', 'kickstart', '-k', id];
} else if (type === 'docker' && ALLOWED_DOCKER.includes(id)) {
cmd = ['docker', 'restart', id];
} else {
res.writeHead(403, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ ok: false, error: 'Service not in allowed list' }));
return;
}
const { execFile } = await import('child_process');
const { promisify } = await import('util');
const execFileAsync = promisify(execFile);
const { stdout, stderr } = await execFileAsync(cmd[0], cmd.slice(1), { timeout: 30000 });
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ ok: true, stdout: stdout.trim(), stderr: stderr.trim() }));
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ ok: false, error: err.message }));
}
});
},
};
}
function gazeProxyPlugin() {
return {
name: 'gaze-proxy',
configureServer(server) {
server.middlewares.use('/api/gaze/presets', async (req, res) => {
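// Without an API key the gaze host can't be queried, so return an empty preset list and let the UI degrade gracefully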
if (!GAZE_API_KEY) {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
return
}
try {
const http = await import('http')
const url = new URL(`${GAZE_HOST}/api/v1/presets`)
const proxyRes = await new Promise((resolve, reject) => {
const r = http.default.get(url, { headers: { 'X-API-Key': GAZE_API_KEY }, timeout: 5000 }, resolve)
r.on('error', reject)
r.on('timeout', () => { r.destroy(); reject(new Error('timeout')) })
})
const chunks = []
for await (const chunk of proxyRes) chunks.push(chunk)
res.writeHead(proxyRes.statusCode, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(Buffer.concat(chunks))
} catch {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
}
})
},
}
}
function memoryStoragePlugin() {
return {
name: 'memory-storage',
configureServer(server) {
const ensureDirs = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(`${MEMORIES_DIR}/personal`, { recursive: true })
}
const readJsonFile = async (path, fallback) => {
const { readFile } = await import('fs/promises')
try {
return JSON.parse(await readFile(path, 'utf-8'))
} catch {
return fallback
}
}
const writeJsonFile = async (path, data) => {
const { writeFile } = await import('fs/promises')
await writeFile(path, JSON.stringify(data, null, 2))
}
// Personal memories: /api/memories/personal/:characterId[/:memoryId]
server.middlewares.use('/api/memories/personal', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const parts = url.pathname.replace(/^\/+/, '').split('/')
const characterId = parts[0] ? parts[0].replace(/[^a-zA-Z0-9_\-\.]/g, '_') : null
const memoryId = parts[1] || null
if (!characterId) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing character ID' }))
return
}
const filePath = `${MEMORIES_DIR}/personal/${characterId}.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { characterId, memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { characterId, memories: [] })
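// Upsert: merge fields into an existing memory by id, or append if the id is unseen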
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
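// No id supplied: mint a timestamp-based id and stamp createdAt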
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { characterId, memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
// General memories: /api/memories/general[/:memoryId]
server.middlewares.use('/api/memories/general', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const memoryId = url.pathname.replace(/^\/+/, '') || null
const filePath = `${MEMORIES_DIR}/general.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { memories: [] })
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
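// New memory: assign a timestamp id and creation time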
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function characterLookupPlugin() {
return {
name: 'character-lookup',
configureServer(server) {
server.middlewares.use('/api/character-lookup', async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
if (req.method !== 'POST') {
res.writeHead(405, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'POST only' }))
return
}
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const { name, franchise } = JSON.parse(Buffer.concat(chunks).toString())
if (!name || !franchise) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing name or franchise' }))
return
}
const { execFile } = await import('child_process')
const { promisify } = await import('util')
const execFileAsync = promisify(execFile)
// Call the MCP fetcher inside the running Docker container
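// Escape single quotes so the values embed safely inside the Python string literals below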
const safeName = name.replace(/'/g, "\\'")
const safeFranchise = franchise.replace(/'/g, "\\'")
const pyScript = `
import asyncio, json
from character_details.fetcher import fetch_character
c = asyncio.run(fetch_character('${safeName}', '${safeFranchise}'))
print(json.dumps(c.model_dump(), default=str))
`.trim()
const { stdout } = await execFileAsync(
'docker',
['exec', 'character-browser-character-mcp-1', 'python', '-c', pyScript],
{ timeout: 30000 }
)
const data = JSON.parse(stdout.trim())
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({
name: data.name || name,
franchise: data.franchise || franchise,
description: data.description || '',
background: data.background || '',
appearance: data.appearance || '',
personality: data.personality || '',
abilities: data.abilities || [],
notable_quotes: data.notable_quotes || [],
relationships: data.relationships || [],
sources: data.sources || [],
}))
} catch (err) {
console.error('[character-lookup] failed:', err?.message || err)
const status = err?.message?.includes('timeout') ? 504 : 500
res.writeHead(status, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: err?.message || 'Lookup failed' }))
}
})
},
}
}
function bridgeProxyPlugin() {
return {
name: 'bridge-proxy',
configureServer(server) {
// Proxy a request to the OpenClaw bridge
const proxyRequest = (targetPath) => async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, {
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'POST, GET, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
})
res.end()
return
}
try {
const { default: http } = await import('http')
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const body = Buffer.concat(chunks)
await new Promise((resolve, reject) => {
const proxyReq = http.request(
`http://localhost:8081${targetPath}`,
{
method: req.method,
headers: {
'Content-Type': req.headers['content-type'] || 'application/json',
'Content-Length': body.length,
},
timeout: 120000,
},
(proxyRes) => {
res.writeHead(proxyRes.statusCode, {
'Content-Type': proxyRes.headers['content-type'] || 'application/json',
'Access-Control-Allow-Origin': '*',
})
proxyRes.pipe(res)
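// Headers are already committed once piping starts, so a stream error just ends the response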
proxyRes.on('end', resolve)
proxyRes.on('error', resolve)
}
)
proxyReq.on('error', reject)
proxyReq.on('timeout', () => {
proxyReq.destroy()
reject(new Error('timeout'))
})
proxyReq.write(body)
proxyReq.end()
})
} catch (err) {
console.error(`[bridge-proxy] ${targetPath} failed:`, err?.message || err)
if (!res.headersSent) {
res.writeHead(502, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: `Bridge unreachable: ${err?.message || 'unknown'}` }))
}
}
}
server.middlewares.use('/api/agent/message', proxyRequest('/api/agent/message'))
server.middlewares.use('/api/tts', proxyRequest('/api/tts'))
server.middlewares.use('/api/stt', proxyRequest('/api/stt'))
},
}
}
export default defineConfig({
plugins: [
characterStoragePlugin(),
satelliteMapPlugin(),
conversationStoragePlugin(),
memoryStoragePlugin(),
gazeProxyPlugin(),
characterLookupPlugin(),
healthCheckPlugin(),
bridgeProxyPlugin(),
tailwindcss(),
react(),
],
server: {
host: '0.0.0.0',
port: 5173,
},
})

homeai-desktop/.gitignore

@@ -0,0 +1,2 @@
node_modules/
dist/

homeai-desktop/index.html

@@ -0,0 +1,15 @@
<!doctype html>
<html lang="en" class="dark">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/icon.svg" />
<link rel="manifest" href="/manifest.json" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="theme-color" content="#030712" />
<title>HomeAI Assistant</title>
</head>
<body class="bg-gray-950 text-gray-100">
<div id="root"></div>
<script type="module" src="/src/main.jsx"></script>
</body>
</html>


@@ -0,0 +1,31 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.desktop-assistant</string>
<key>ProgramArguments</key>
<array>
<string>/opt/homebrew/bin/npx</string>
<string>vite</string>
<string>--host</string>
<string>--port</string>
<string>5174</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/aodhan/gitea/homeai/homeai-desktop</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-desktop-assistant.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-desktop-assistant-error.log</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
</dict>
</dict>
</plist>

homeai-desktop/package-lock.json (generated)

File diff suppressed because it is too large.


@@ -0,0 +1,24 @@
{
"name": "homeai-desktop",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"dependencies": {
"@tailwindcss/vite": "^4.2.1",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"tailwindcss": "^4.2.1"
},
"devDependencies": {
"@vitejs/plugin-react": "^5.1.1",
"vite": "^8.0.0-beta.13"
},
"overrides": {
"vite": "^8.0.0-beta.13"
}
}


@@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64">
<rect width="64" height="64" rx="14" fill="#030712"/>
<circle cx="32" cy="28" r="12" fill="none" stroke="#818cf8" stroke-width="2.5"/>
<path d="M26 26c0-3.3 2.7-6 6-6s6 2.7 6 6" fill="none" stroke="#818cf8" stroke-width="2" stroke-linecap="round"/>
<rect x="30" y="40" width="4" height="8" rx="2" fill="#818cf8"/>
<path d="M24 52h16" stroke="#818cf8" stroke-width="2.5" stroke-linecap="round"/>
<circle cx="29" cy="27" r="1.5" fill="#34d399"/>
<circle cx="35" cy="27" r="1.5" fill="#34d399"/>
</svg>



@@ -0,0 +1,16 @@
{
"name": "HomeAI Assistant",
"short_name": "HomeAI",
"description": "Desktop AI assistant powered by local LLMs",
"start_url": "/",
"display": "standalone",
"background_color": "#030712",
"theme_color": "#030712",
"icons": [
{
"src": "/icon.svg",
"sizes": "any",
"type": "image/svg+xml"
}
]
}

homeai-desktop/src/App.jsx

@@ -0,0 +1,115 @@
import { useState, useEffect, useCallback } from 'react'
import ChatPanel from './components/ChatPanel'
import InputBar from './components/InputBar'
import StatusIndicator from './components/StatusIndicator'
import SettingsDrawer from './components/SettingsDrawer'
import { useSettings } from './hooks/useSettings'
import { useBridgeHealth } from './hooks/useBridgeHealth'
import { useChat } from './hooks/useChat'
import { useTtsPlayback } from './hooks/useTtsPlayback'
import { useVoiceInput } from './hooks/useVoiceInput'
export default function App() {
const { settings, updateSetting } = useSettings()
const isOnline = useBridgeHealth()
const { messages, isLoading, send, clearHistory } = useChat()
const { isPlaying, speak, stop } = useTtsPlayback(settings.voice)
const { isRecording, isTranscribing, startRecording, stopRecording } = useVoiceInput(settings.sttMode)
const [settingsOpen, setSettingsOpen] = useState(false)
// Send a message and optionally speak the response
const handleSend = useCallback(async (text) => {
const response = await send(text)
if (response && settings.autoTts) {
speak(response)
}
}, [send, settings.autoTts, speak])
// Toggle voice recording
const handleVoiceToggle = useCallback(async () => {
if (isRecording) {
const text = await stopRecording()
if (text) {
handleSend(text)
}
} else {
startRecording()
}
}, [isRecording, stopRecording, startRecording, handleSend])
// Space toggles voice recording when no text input is focused
useEffect(() => {
const handleKeyDown = (e) => {
if (e.code === 'Space' && !e.repeat && e.target.tagName !== 'TEXTAREA' && e.target.tagName !== 'INPUT') {
e.preventDefault()
handleVoiceToggle()
}
}
window.addEventListener('keydown', handleKeyDown)
return () => window.removeEventListener('keydown', handleKeyDown)
}, [handleVoiceToggle])
return (
<div className="h-screen flex flex-col bg-gray-950">
{/* Status bar */}
<header className="flex items-center justify-between px-4 py-2 border-b border-gray-800/50">
<div className="flex items-center gap-2">
<StatusIndicator isOnline={isOnline} />
<span className="text-xs text-gray-500">
{isOnline === null ? 'Connecting...' : isOnline ? 'Connected' : 'Offline'}
</span>
</div>
<div className="flex items-center gap-2">
{messages.length > 0 && (
<button
onClick={clearHistory}
className="text-xs text-gray-500 hover:text-gray-300 transition-colors px-2 py-1"
title="Clear conversation"
>
Clear
</button>
)}
{isPlaying && (
<button
onClick={stop}
className="text-xs text-indigo-400 hover:text-indigo-300 transition-colors px-2 py-1"
title="Stop speaking"
>
Stop audio
</button>
)}
<button
onClick={() => setSettingsOpen(true)}
className="text-gray-500 hover:text-gray-300 transition-colors p-1"
title="Settings"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.325.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.241-.438.613-.43.992a7.723 7.723 0 010 .255c-.008.378.137.75.43.991l1.004.827c.424.35.534.955.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.47 6.47 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.281c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.019-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.991a6.932 6.932 0 010-.255c.007-.38-.138-.751-.43-.992l-1.004-.827a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.086.22-.128.332-.183.582-.495.644-.869l.214-1.28z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
</button>
</div>
</header>
{/* Chat area */}
<ChatPanel messages={messages} isLoading={isLoading} onReplay={speak} />
{/* Input */}
<InputBar
onSend={handleSend}
onVoiceToggle={handleVoiceToggle}
isLoading={isLoading}
isRecording={isRecording}
isTranscribing={isTranscribing}
/>
{/* Settings drawer */}
<SettingsDrawer
isOpen={settingsOpen}
onClose={() => setSettingsOpen(false)}
settings={settings}
onUpdate={updateSetting}
/>
</div>
)
}


@@ -0,0 +1,35 @@
import { useEffect, useRef } from 'react'
import MessageBubble from './MessageBubble'
import ThinkingIndicator from './ThinkingIndicator'
export default function ChatPanel({ messages, isLoading, onReplay }) {
const bottomRef = useRef(null)
useEffect(() => {
bottomRef.current?.scrollIntoView({ behavior: 'smooth' })
}, [messages, isLoading])
if (messages.length === 0 && !isLoading) {
return (
<div className="flex-1 flex items-center justify-center">
<div className="text-center">
<div className="w-16 h-16 rounded-full bg-indigo-600/20 flex items-center justify-center mx-auto mb-4">
<span className="text-indigo-400 text-2xl">AI</span>
</div>
<h2 className="text-xl font-medium text-gray-200 mb-2">Hi, I'm Aria</h2>
<p className="text-gray-500 text-sm">Type a message or press the mic to talk</p>
</div>
</div>
)
}
return (
<div className="flex-1 overflow-y-auto py-4">
{messages.map((msg) => (
<MessageBubble key={msg.id} message={msg} onReplay={onReplay} />
))}
{isLoading && <ThinkingIndicator />}
<div ref={bottomRef} />
</div>
)
}


@@ -0,0 +1,53 @@
import { useState, useRef } from 'react'
import VoiceButton from './VoiceButton'
export default function InputBar({ onSend, onVoiceToggle, isLoading, isRecording, isTranscribing }) {
const [text, setText] = useState('')
const inputRef = useRef(null)
const handleSubmit = (e) => {
e.preventDefault()
if (!text.trim() || isLoading) return
onSend(text)
setText('')
}
const handleKeyDown = (e) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
handleSubmit(e)
}
}
return (
<form onSubmit={handleSubmit} className="border-t border-gray-800 bg-gray-950 px-4 py-3">
<div className="flex items-end gap-2 max-w-3xl mx-auto">
<VoiceButton
isRecording={isRecording}
isTranscribing={isTranscribing}
onToggle={onVoiceToggle}
disabled={isLoading}
/>
<textarea
ref={inputRef}
value={text}
onChange={(e) => setText(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Type a message..."
rows={1}
className="flex-1 bg-gray-800 text-gray-100 rounded-xl px-4 py-2.5 text-sm resize-none placeholder-gray-500 focus:outline-none focus:ring-1 focus:ring-indigo-500 min-h-[42px] max-h-32"
disabled={isLoading}
/>
<button
type="submit"
disabled={!text.trim() || isLoading}
className="w-10 h-10 rounded-full bg-indigo-600 text-white flex items-center justify-center shrink-0 hover:bg-indigo-500 disabled:opacity-40 disabled:hover:bg-indigo-600 transition-colors"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 12L3.269 3.126A59.768 59.768 0 0121.485 12 59.77 59.77 0 013.27 20.876L5.999 12zm0 0h7.5" />
</svg>
</button>
</div>
</form>
)
}


@@ -0,0 +1,39 @@
export default function MessageBubble({ message, onReplay }) {
const isUser = message.role === 'user'
return (
<div className={`flex ${isUser ? 'justify-end' : 'justify-start'} px-4 py-1.5`}>
<div className={`flex items-start gap-3 max-w-[80%] ${isUser ? 'flex-row-reverse' : ''}`}>
{!isUser && (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0 mt-0.5">
<span className="text-indigo-400 text-sm">AI</span>
</div>
)}
<div>
<div
className={`rounded-2xl px-4 py-2.5 text-sm leading-relaxed whitespace-pre-wrap ${
isUser
? 'bg-indigo-600 text-white'
: message.isError
? 'bg-red-900/40 text-red-200 border border-red-800/50'
: 'bg-gray-800 text-gray-100'
}`}
>
{message.content}
</div>
{!isUser && !message.isError && onReplay && (
<button
onClick={() => onReplay(message.content)}
className="mt-1 ml-1 text-gray-500 hover:text-indigo-400 transition-colors"
title="Replay audio"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.536 8.464a5 5 0 010 7.072M17.95 6.05a8 8 0 010 11.9M6.5 9H4a1 1 0 00-1 1v4a1 1 0 001 1h2.5l4 4V5l-4 4z" />
</svg>
</button>
)}
</div>
</div>
</div>
)
}


@@ -0,0 +1,74 @@
import { VOICES } from '../lib/constants'
export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate }) {
if (!isOpen) return null
return (
<>
<div className="fixed inset-0 bg-black/50 z-40" onClick={onClose} />
<div className="fixed right-0 top-0 bottom-0 w-80 bg-gray-900 border-l border-gray-800 z-50 flex flex-col">
<div className="flex items-center justify-between px-4 py-3 border-b border-gray-800">
<h2 className="text-sm font-medium text-gray-200">Settings</h2>
<button onClick={onClose} className="text-gray-500 hover:text-gray-300">
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
<div className="flex-1 overflow-y-auto p-4 space-y-5">
{/* Voice */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Voice</label>
<select
value={settings.voice}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{VOICES.map((v) => (
<option key={v.id} value={v.id}>{v.label}</option>
))}
</select>
</div>
{/* Auto TTS */}
<div className="flex items-center justify-between">
<div>
<div className="text-sm text-gray-200">Auto-speak responses</div>
<div className="text-xs text-gray-500">Speak assistant replies aloud</div>
</div>
<button
onClick={() => onUpdate('autoTts', !settings.autoTts)}
className={`relative w-10 h-6 rounded-full transition-colors ${
settings.autoTts ? 'bg-indigo-600' : 'bg-gray-700'
}`}
>
<span
className={`absolute top-0.5 left-0.5 w-5 h-5 rounded-full bg-white transition-transform ${
settings.autoTts ? 'translate-x-4' : ''
}`}
/>
</button>
</div>
{/* STT Mode */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Speech recognition</label>
<select
value={settings.sttMode}
onChange={(e) => onUpdate('sttMode', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="bridge">Wyoming STT (local)</option>
<option value="webspeech">Web Speech API (browser)</option>
</select>
<p className="text-xs text-gray-500 mt-1">
{settings.sttMode === 'bridge'
? 'Uses Whisper via the local bridge server'
: 'Uses browser built-in speech recognition'}
</p>
</div>
</div>
</div>
</>
)
}


@@ -0,0 +1,11 @@
export default function StatusIndicator({ isOnline }) {
if (isOnline === null) {
return <span className="inline-block w-2.5 h-2.5 rounded-full bg-gray-500 animate-pulse" title="Checking..." />
}
return (
<span
className={`inline-block w-2.5 h-2.5 rounded-full ${isOnline ? 'bg-emerald-400' : 'bg-red-400'}`}
title={isOnline ? 'Bridge online' : 'Bridge offline'}
/>
)
}


@@ -0,0 +1,14 @@
export default function ThinkingIndicator() {
return (
<div className="flex items-start gap-3 px-4 py-3">
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0">
<span className="text-indigo-400 text-sm">AI</span>
</div>
<div className="flex items-center gap-1 pt-2.5">
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.2s_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.4s_infinite]" />
</div>
</div>
)
}


@@ -0,0 +1,32 @@
export default function VoiceButton({ isRecording, isTranscribing, onToggle, disabled }) {
const handleClick = () => {
if (disabled || isTranscribing) return
onToggle()
}
return (
<button
onClick={handleClick}
disabled={disabled || isTranscribing}
className={`w-10 h-10 rounded-full flex items-center justify-center transition-all shrink-0 ${
isRecording
? 'bg-red-500 text-white shadow-[0_0_0_4px_rgba(239,68,68,0.3)] animate-pulse'
: isTranscribing
? 'bg-gray-700 text-gray-400 cursor-wait'
: 'bg-gray-800 text-gray-400 hover:bg-gray-700 hover:text-gray-200'
}`}
title={isRecording ? 'Stop recording' : isTranscribing ? 'Transcribing...' : 'Start recording (Space)'}
>
{isTranscribing ? (
<svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
) : (
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18.75a6 6 0 006-6v-1.5m-6 7.5a6 6 0 01-6-6v-1.5m6 7.5v3.75m-3.75 0h7.5M12 15.75a3 3 0 01-3-3V4.5a3 3 0 116 0v8.25a3 3 0 01-3 3z" />
</svg>
)}
</button>
)
}


@@ -0,0 +1,18 @@
import { useState, useEffect, useRef } from 'react'
import { healthCheck } from '../lib/api'
export function useBridgeHealth() {
const [isOnline, setIsOnline] = useState(null)
const intervalRef = useRef(null)
useEffect(() => {
const check = async () => {
setIsOnline(await healthCheck())
}
check()
intervalRef.current = setInterval(check, 15000)
return () => clearInterval(intervalRef.current)
}, [])
return isOnline
}


@@ -0,0 +1,45 @@
import { useState, useCallback } from 'react'
import { sendMessage } from '../lib/api'
export function useChat() {
const [messages, setMessages] = useState([])
const [isLoading, setIsLoading] = useState(false)
const send = useCallback(async (text) => {
if (!text.trim() || isLoading) return null
const userMsg = { id: Date.now(), role: 'user', content: text.trim(), timestamp: new Date() }
setMessages((prev) => [...prev, userMsg])
setIsLoading(true)
try {
const response = await sendMessage(text.trim())
const assistantMsg = {
id: Date.now() + 1,
role: 'assistant',
content: response,
timestamp: new Date(),
}
setMessages((prev) => [...prev, assistantMsg])
return response
} catch (err) {
const errorMsg = {
id: Date.now() + 1,
role: 'assistant',
content: `Error: ${err.message}`,
timestamp: new Date(),
isError: true,
}
setMessages((prev) => [...prev, errorMsg])
return null
} finally {
setIsLoading(false)
}
}, [isLoading])
const clearHistory = useCallback(() => {
setMessages([])
}, [])
return { messages, isLoading, send, clearHistory }
}


@@ -0,0 +1,27 @@
import { useState, useCallback } from 'react'
import { DEFAULT_SETTINGS } from '../lib/constants'
const STORAGE_KEY = 'homeai_desktop_settings'
function loadSettings() {
try {
const stored = localStorage.getItem(STORAGE_KEY)
return stored ? { ...DEFAULT_SETTINGS, ...JSON.parse(stored) } : { ...DEFAULT_SETTINGS }
} catch {
return { ...DEFAULT_SETTINGS }
}
}
export function useSettings() {
const [settings, setSettings] = useState(loadSettings)
const updateSetting = useCallback((key, value) => {
setSettings((prev) => {
const next = { ...prev, [key]: value }
localStorage.setItem(STORAGE_KEY, JSON.stringify(next))
return next
})
}, [])
return { settings, updateSetting }
}


@@ -0,0 +1,56 @@
import { useState, useRef, useCallback } from 'react'
import { synthesize } from '../lib/api'
export function useTtsPlayback(voice) {
const [isPlaying, setIsPlaying] = useState(false)
const audioCtxRef = useRef(null)
const sourceRef = useRef(null)
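// Reuse a single AudioContext across playbacks; browsers cap the number of live contexts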
const getAudioContext = () => {
if (!audioCtxRef.current || audioCtxRef.current.state === 'closed') {
audioCtxRef.current = new AudioContext()
}
return audioCtxRef.current
}
const speak = useCallback(async (text) => {
if (!text) return
// Stop any current playback
if (sourceRef.current) {
try { sourceRef.current.stop() } catch {}
}
setIsPlaying(true)
try {
const audioData = await synthesize(text, voice)
const ctx = getAudioContext()
if (ctx.state === 'suspended') await ctx.resume()
const audioBuffer = await ctx.decodeAudioData(audioData)
const source = ctx.createBufferSource()
source.buffer = audioBuffer
source.connect(ctx.destination)
sourceRef.current = source
source.onended = () => {
setIsPlaying(false)
sourceRef.current = null
}
source.start()
} catch (err) {
console.error('TTS playback error:', err)
setIsPlaying(false)
}
}, [voice])
const stop = useCallback(() => {
if (sourceRef.current) {
try { sourceRef.current.stop() } catch {}
sourceRef.current = null
}
setIsPlaying(false)
}, [])
return { isPlaying, speak, stop }
}


@@ -0,0 +1,93 @@
import { useState, useRef, useCallback } from 'react'
import { createRecorder } from '../lib/audio'
import { transcribe } from '../lib/api'
export function useVoiceInput(sttMode = 'bridge') {
const [isRecording, setIsRecording] = useState(false)
const [isTranscribing, setIsTranscribing] = useState(false)
const recorderRef = useRef(null)
const webSpeechRef = useRef(null)
const startRecording = useCallback(async () => {
if (isRecording) return
if (sttMode === 'webspeech' && ('webkitSpeechRecognition' in window || 'SpeechRecognition' in window)) {
return startWebSpeech()
}
try {
const recorder = createRecorder()
recorderRef.current = recorder
await recorder.start()
setIsRecording(true)
} catch (err) {
console.error('Mic access error:', err)
}
}, [isRecording, sttMode])
const stopRecording = useCallback(async () => {
if (!isRecording) return null
if (sttMode === 'webspeech' && webSpeechRef.current) {
return stopWebSpeech()
}
setIsRecording(false)
setIsTranscribing(true)
try {
const wavBlob = await recorderRef.current.stop()
recorderRef.current = null
const text = await transcribe(wavBlob)
return text
} catch (err) {
console.error('Transcription error:', err)
return null
} finally {
setIsTranscribing(false)
}
}, [isRecording, sttMode])
// Web Speech API fallback
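// start() only begins recognition; the transcript is collected later in stopWebSpeech()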
function startWebSpeech() {
return new Promise((resolve) => {
const SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition
const recognition = new SpeechRecognition()
recognition.continuous = false
recognition.interimResults = false
recognition.lang = 'en-US'
webSpeechRef.current = { recognition, resolve: null }
recognition.start()
setIsRecording(true)
resolve()
})
}
function stopWebSpeech() {
return new Promise((resolve) => {
const { recognition } = webSpeechRef.current
recognition.onresult = (e) => {
const text = e.results[0]?.[0]?.transcript || ''
setIsRecording(false)
webSpeechRef.current = null
resolve(text)
}
recognition.onerror = () => {
setIsRecording(false)
webSpeechRef.current = null
resolve(null)
}
recognition.onend = () => {
// If no result event fired, resolve with null
setIsRecording(false)
if (webSpeechRef.current) {
webSpeechRef.current = null
resolve(null)
}
}
recognition.stop()
})
}
return { isRecording, isTranscribing, startRecording, stopRecording }
}


@@ -0,0 +1 @@
@import "tailwindcss";


@@ -0,0 +1,44 @@
export async function sendMessage(text) {
const res = await fetch('/api/agent/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: text, agent: 'main' }),
})
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Request failed' }))
throw new Error(err.error || `HTTP ${res.status}`)
}
const data = await res.json()
return data.response
}
export async function synthesize(text, voice) {
const res = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text, voice }),
})
if (!res.ok) throw new Error('TTS failed')
return await res.arrayBuffer()
}
export async function transcribe(wavBlob) {
const res = await fetch('/api/stt', {
method: 'POST',
headers: { 'Content-Type': 'audio/wav' },
body: wavBlob,
})
if (!res.ok) throw new Error('STT failed')
const data = await res.json()
return data.text
}
export async function healthCheck() {
try {
const res = await fetch('/api/health', { signal: AbortSignal.timeout(5000) })
const data = await res.json()
return data.status === 'online'
} catch {
return false
}
}


@@ -0,0 +1,103 @@
const TARGET_RATE = 16000
/**
* Record audio from the microphone, returning a WAV blob when stopped.
* Returns { start, stop } — call start() to begin, stop() resolves with a Blob.
*/
export function createRecorder() {
let audioCtx
let source
let processor
let stream
let samples = []
async function start() {
samples = []
stream = await navigator.mediaDevices.getUserMedia({
audio: { channelCount: 1, sampleRate: TARGET_RATE },
})
audioCtx = new AudioContext({ sampleRate: TARGET_RATE })
source = audioCtx.createMediaStreamSource(stream)
// ScriptProcessorNode captures raw Float32 PCM
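// (deprecated in favor of AudioWorklet, but simpler and still widely supported)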
processor = audioCtx.createScriptProcessor(4096, 1, 1)
processor.onaudioprocess = (e) => {
const input = e.inputBuffer.getChannelData(0)
samples.push(new Float32Array(input))
}
source.connect(processor)
processor.connect(audioCtx.destination)
}
async function stop() {
// Stop everything
processor.disconnect()
source.disconnect()
stream.getTracks().forEach((t) => t.stop())
await audioCtx.close()
// Merge all sample chunks
const totalLength = samples.reduce((acc, s) => acc + s.length, 0)
const merged = new Float32Array(totalLength)
let offset = 0
for (const chunk of samples) {
merged.set(chunk, offset)
offset += chunk.length
}
// Resample if the actual sample rate differs from target
const resampled = audioCtx.sampleRate !== TARGET_RATE
? resample(merged, audioCtx.sampleRate, TARGET_RATE)
: merged
// Convert to 16-bit PCM WAV
return encodeWav(resampled, TARGET_RATE)
}
return { start, stop }
}
// Nearest-neighbor resample; adequate for speech audio headed to Whisper
function resample(samples, fromRate, toRate) {
const ratio = fromRate / toRate
const newLength = Math.round(samples.length / ratio)
const result = new Float32Array(newLength)
for (let i = 0; i < newLength; i++) {
// Clamp the source index so rounding never reads past the end of the buffer
result[i] = samples[Math.min(samples.length - 1, Math.round(i * ratio))]
}
return result
}
function encodeWav(samples, sampleRate) {
const numSamples = samples.length
const buffer = new ArrayBuffer(44 + numSamples * 2)
const view = new DataView(buffer)
// WAV header
writeString(view, 0, 'RIFF')
view.setUint32(4, 36 + numSamples * 2, true)
writeString(view, 8, 'WAVE')
writeString(view, 12, 'fmt ')
view.setUint32(16, 16, true) // chunk size
view.setUint16(20, 1, true) // PCM
view.setUint16(22, 1, true) // mono
view.setUint32(24, sampleRate, true)
view.setUint32(28, sampleRate * 2, true) // byte rate
view.setUint16(32, 2, true) // block align
view.setUint16(34, 16, true) // bits per sample
writeString(view, 36, 'data')
view.setUint32(40, numSamples * 2, true)
// PCM data — clamp Float32 to Int16
for (let i = 0; i < numSamples; i++) {
const s = Math.max(-1, Math.min(1, samples[i]))
view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true)
}
return new Blob([buffer], { type: 'audio/wav' })
}
function writeString(view, offset, str) {
for (let i = 0; i < str.length; i++) {
view.setUint8(offset + i, str.charCodeAt(i))
}
}


@@ -0,0 +1,37 @@
export const DEFAULT_VOICE = 'af_heart'
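// Kokoro voice ids: prefix encodes accent (a = US, b = UK) and gender (f = female, m = male)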
export const VOICES = [
{ id: 'af_heart', label: 'Heart (F, US)' },
{ id: 'af_alloy', label: 'Alloy (F, US)' },
{ id: 'af_aoede', label: 'Aoede (F, US)' },
{ id: 'af_bella', label: 'Bella (F, US)' },
{ id: 'af_jessica', label: 'Jessica (F, US)' },
{ id: 'af_kore', label: 'Kore (F, US)' },
{ id: 'af_nicole', label: 'Nicole (F, US)' },
{ id: 'af_nova', label: 'Nova (F, US)' },
{ id: 'af_river', label: 'River (F, US)' },
{ id: 'af_sarah', label: 'Sarah (F, US)' },
{ id: 'af_sky', label: 'Sky (F, US)' },
{ id: 'am_adam', label: 'Adam (M, US)' },
{ id: 'am_echo', label: 'Echo (M, US)' },
{ id: 'am_eric', label: 'Eric (M, US)' },
{ id: 'am_fenrir', label: 'Fenrir (M, US)' },
{ id: 'am_liam', label: 'Liam (M, US)' },
{ id: 'am_michael', label: 'Michael (M, US)' },
{ id: 'am_onyx', label: 'Onyx (M, US)' },
{ id: 'am_puck', label: 'Puck (M, US)' },
{ id: 'bf_alice', label: 'Alice (F, UK)' },
{ id: 'bf_emma', label: 'Emma (F, UK)' },
{ id: 'bf_isabella', label: 'Isabella (F, UK)' },
{ id: 'bf_lily', label: 'Lily (F, UK)' },
{ id: 'bm_daniel', label: 'Daniel (M, UK)' },
{ id: 'bm_fable', label: 'Fable (M, UK)' },
{ id: 'bm_george', label: 'George (M, UK)' },
{ id: 'bm_lewis', label: 'Lewis (M, UK)' },
]
export const DEFAULT_SETTINGS = {
voice: DEFAULT_VOICE,
autoTts: true,
sttMode: 'bridge', // 'bridge' or 'webspeech'
}


@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.jsx'
createRoot(document.getElementById('root')).render(
<StrictMode>
<App />
</StrictMode>,
)

View File

@@ -0,0 +1,102 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
function bridgeProxyPlugin() {
return {
name: 'bridge-proxy',
configureServer(server) {
// Proxy a request to the OpenClaw bridge
const proxyRequest = (targetPath) => async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, {
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'POST, GET, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
})
res.end()
return
}
try {
const { default: http } = await import('http')
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const body = Buffer.concat(chunks)
await new Promise((resolve, reject) => {
const proxyReq = http.request(
`http://localhost:8081${targetPath}`,
{
method: req.method,
headers: {
'Content-Type': req.headers['content-type'] || 'application/json',
'Content-Length': body.length,
},
timeout: 120000,
},
(proxyRes) => {
res.writeHead(proxyRes.statusCode, {
'Content-Type': proxyRes.headers['content-type'] || 'application/json',
'Access-Control-Allow-Origin': '*',
})
proxyRes.pipe(res)
proxyRes.on('end', resolve)
proxyRes.on('error', resolve)
}
)
proxyReq.on('error', reject)
proxyReq.on('timeout', () => {
proxyReq.destroy()
reject(new Error('timeout'))
})
proxyReq.write(body)
proxyReq.end()
})
} catch {
if (!res.headersSent) {
res.writeHead(502, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Bridge unreachable' }))
}
}
}
server.middlewares.use('/api/agent/message', proxyRequest('/api/agent/message'))
server.middlewares.use('/api/tts', proxyRequest('/api/tts'))
server.middlewares.use('/api/stt', proxyRequest('/api/stt'))
// Health check — direct to bridge
server.middlewares.use('/api/health', async (req, res) => {
try {
const { default: http } = await import('http')
const start = Date.now()
await new Promise((resolve, reject) => {
const reqObj = http.get('http://localhost:8081/', { timeout: 5000 }, (resp) => {
resp.resume()
resolve()
})
reqObj.on('error', reject)
reqObj.on('timeout', () => { reqObj.destroy(); reject(new Error('timeout')) })
})
res.writeHead(200, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ status: 'online', responseTime: Date.now() - start }))
} catch {
res.writeHead(200, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ status: 'offline', responseTime: null }))
}
})
},
}
}
export default defineConfig({
plugins: [
bridgeProxyPlugin(),
tailwindcss(),
react(),
],
server: {
host: '0.0.0.0',
port: 5174,
},
})
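A quick LAN smoke test for the middleware above. This is a sketch: it assumes the dev server is reachable at `10.0.0.101:5174` (hypothetical IP) and that the bridge accepts a `{"message": ...}` JSON body; adjust to the bridge's real schema.

```python
import json
import urllib.request

BASE = "http://10.0.0.101:5174"  # hypothetical LAN address of the Vite dev server

# /api/health always answers 200, with an online/offline status body
with urllib.request.urlopen(f"{BASE}/api/health", timeout=10) as resp:
    print(json.load(resp))  # e.g. {"status": "online", "responseTime": 12}

# Proxied agent call, forwarded to the OpenClaw bridge on localhost:8081
req = urllib.request.Request(
    f"{BASE}/api/agent/message",
    data=json.dumps({"message": "ping"}).encode(),  # assumed payload shape
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    print(resp.status, resp.read()[:200])
```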

View File

@@ -6,7 +6,7 @@
## Goal
-Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, local wake word detection, audio playback, and an LVGL animated face showing assistant state. All intelligence stays on the Mac Mini.
+Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, on-device wake word detection, audio playback, and a display showing assistant state via static PNG face illustrations. All intelligence stays on the Mac Mini.
---
@@ -17,11 +17,12 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite
| SoC | ESP32-S3 (dual-core Xtensa, 240MHz) |
| RAM | 512KB SRAM + 16MB PSRAM |
| Flash | 16MB |
-| Display | 2.4" IPS LCD, 320×240, touchscreen |
-| Mic | Dual microphone array |
-| Speaker | Built-in 1W speaker |
-| Connectivity | WiFi 802.11b/g/n, BT 5.0 |
-| USB | USB-C (programming + power) |
+| Display | 2.4" IPS LCD, 320×240, touchscreen (ILI9xxx, model S3BOX) |
+| Audio ADC | ES7210 (dual mic array, 16kHz 16-bit) |
+| Audio DAC | ES8311 (speaker output, 48kHz 16-bit) |
+| Speaker | Built-in 1W |
+| Connectivity | WiFi 802.11b/g/n (2.4GHz only), BT 5.0 |
+| USB | USB-C (programming + power, native USB JTAG serial) |
---
@@ -29,273 +30,100 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite
```
ESP32-S3-BOX-3
-├── microWakeWord (on-device, always listening)
-│   └── triggers Wyoming Satellite on wake detection
-├── Wyoming Satellite
-│   ├── streams mic audio → Mac Mini Wyoming STT (port 10300)
-│   └── receives TTS audio ← Mac Mini Wyoming TTS (port 10301)
-├── LVGL Display
-│   └── animated face, driven by HA entity state
+├── micro_wake_word (on-device, always listening)
+│   └── "hey_jarvis" — triggers voice_assistant on wake detection
+├── voice_assistant (ESPHome component)
+│   ├── connects to Home Assistant via ESPHome API
+│   ├── HA routes audio → Mac Mini Wyoming STT (10.0.0.101:10300)
+│   ├── HA routes text → OpenClaw conversation agent (10.0.0.101:8081)
+│   └── HA routes response → Mac Mini Wyoming TTS (10.0.0.101:10301)
+├── Display (ili9xxx, model S3BOX, 320×240)
+│   └── static PNG faces per state (idle, listening, thinking, replying, error)
└── ESPHome OTA
    └── firmware updates over WiFi
```
---
+## Pin Map (ESP32-S3-BOX-3)
+| Function | Pin(s) | Notes |
+|---|---|---|
+| I2S LRCLK | GPIO45 | strapping pin — warning ignored |
+| I2S BCLK | GPIO17 | |
+| I2S MCLK | GPIO2 | |
+| I2S DIN (mic) | GPIO16 | ES7210 ADC input |
+| I2S DOUT (speaker) | GPIO15 | ES8311 DAC output |
+| Speaker enable | GPIO46 | strapping pin — warning ignored |
+| I2C SCL | GPIO18 | audio codec control bus |
+| I2C SDA | GPIO8 | audio codec control bus |
+| SPI CLK (display) | GPIO7 | |
+| SPI MOSI (display) | GPIO6 | |
+| Display CS | GPIO5 | |
+| Display DC | GPIO4 | |
+| Display Reset | GPIO48 | inverted |
+| Backlight | GPIO47 | LEDC PWM |
+| Left top button | GPIO0 | strapping pin — mute toggle / factory reset |
+| Sensor dock I2C SCL | GPIO40 | sensor bus (AHT-30, AT581x radar) |
+| Sensor dock I2C SDA | GPIO41 | sensor bus (AHT-30, AT581x radar) |
+| Radar presence output | GPIO21 | AT581x digital detection pin |
+---
## ESPHome Configuration
-### Base Config Template
-`esphome/base.yaml` — shared across all units:
+### Platform & Framework
```yaml
-esphome:
-  name: homeai-${room}
-  friendly_name: "HomeAI ${room_display}"
-  platform: esp32
-  board: esp32-s3-box-3
-wifi:
-  ssid: !secret wifi_ssid
-  password: !secret wifi_password
-  ap:
-    ssid: "HomeAI Fallback"
-api:
-  encryption:
-    key: !secret api_key
-ota:
-  password: !secret ota_password
-logger:
-  level: INFO
+esp32:
+  board: esp32s3box
+  flash_size: 16MB
+  cpu_frequency: 240MHz
+  framework:
+    type: esp-idf
+    sdkconfig_options:
+      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
+      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
+      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
+psram:
+  mode: octal
+  speed: 80MHz
```
-### Room-Specific Config
-`esphome/s3-box-living-room.yaml`:
-```yaml
-substitutions:
-  room: living-room
-  room_display: "Living Room"
-  mac_mini_ip: "192.168.1.x"  # or Tailscale IP
-packages:
-  base: !include base.yaml
-  voice: !include voice.yaml
-  display: !include display.yaml
-```
-One file per room, only the substitutions change.
-### Voice / Wyoming Satellite — `esphome/voice.yaml`
-```yaml
-microphone:
-  - platform: esp_adf
-    id: mic
-speaker:
-  - platform: esp_adf
-    id: spk
-micro_wake_word:
-  model: hey_jarvis  # or custom model path
-  on_wake_word_detected:
-    - voice_assistant.start:
-voice_assistant:
-  microphone: mic
-  speaker: spk
-  noise_suppression_level: 2
-  auto_gain: 31dBFS
-  volume_multiplier: 2.0
-  on_listening:
-    - display.page.show: page_listening
-    - script.execute: animate_face_listening
-  on_stt_vad_end:
-    - display.page.show: page_thinking
-    - script.execute: animate_face_thinking
-  on_tts_start:
-    - display.page.show: page_speaking
-    - script.execute: animate_face_speaking
-  on_end:
-    - display.page.show: page_idle
-    - script.execute: animate_face_idle
-  on_error:
-    - display.page.show: page_error
-    - script.execute: animate_face_error
-```
-**Note:** ESPHome's `voice_assistant` component connects to HA, which routes to Wyoming STT/TTS on the Mac Mini. This is the standard ESPHome → HA → Wyoming path.
-### LVGL Display — `esphome/display.yaml`
-```yaml
-display:
-  - platform: ili9xxx
-    model: ILI9341
-    id: lcd
-    cs_pin: GPIO5
-    dc_pin: GPIO4
-    reset_pin: GPIO48
-touchscreen:
-  - platform: tt21100
-    id: touch
-lvgl:
-  displays:
-    - lcd
-  touchscreens:
-    - touch
-  # Face widget — centered on screen
-  widgets:
-    - obj:
-        id: face_container
-        width: 320
-        height: 240
-        bg_color: 0x000000
-        children:
-          # Eyes (two circles)
-          - obj:
-              id: eye_left
-              x: 90
-              y: 90
-              width: 50
-              height: 50
-              radius: 25
-              bg_color: 0xFFFFFF
-          - obj:
-              id: eye_right
-              x: 180
-              y: 90
-              width: 50
-              height: 50
-              radius: 25
-              bg_color: 0xFFFFFF
-          # Mouth (line/arc)
-          - arc:
-              id: mouth
-              x: 110
-              y: 160
-              width: 100
-              height: 40
-              start_angle: 180
-              end_angle: 360
-              arc_color: 0xFFFFFF
-  pages:
-    - id: page_idle
-    - id: page_listening
-    - id: page_thinking
-    - id: page_speaking
-    - id: page_error
-```
-### LVGL Face State Animations — `esphome/animations.yaml`
-```yaml
-script:
-  - id: animate_face_idle
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 50  # normal open
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 50
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0xFFFFFF
-  - id: animate_face_listening
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 60  # wider eyes
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 60
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0x00BFFF  # blue tint
-  - id: animate_face_thinking
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 20  # squinting
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 20
-  - id: animate_face_speaking
-    then:
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0x00FF88  # green speaking indicator
-  - id: animate_face_error
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          bg_color: 0xFF2200  # red eyes
-      - lvgl.widget.modify:
-          id: eye_right
-          bg_color: 0xFF2200
-```
-> **Note:** True lip-sync animation (mouth moving with audio) is complex on ESP32. Phase 1: static states. Phase 2: amplitude-driven mouth height using speaker volume feedback.
----
-## Secrets File
-`esphome/secrets.yaml` (gitignored):
-```yaml
-wifi_ssid: "YourNetwork"
-wifi_password: "YourPassword"
-api_key: "<32-byte base64 key>"
-ota_password: "YourOTAPassword"
-```
----
-## Flash & Deployment Workflow
-```bash
-# Install ESPHome
-pip install esphome
-# Compile + flash via USB (first time)
-esphome run esphome/s3-box-living-room.yaml
-# OTA update (subsequent)
-esphome upload esphome/s3-box-living-room.yaml --device <device-ip>
-# View logs
-esphome logs esphome/s3-box-living-room.yaml
-```
----
-## Home Assistant Integration
-After flashing:
-1. HA discovers ESP32 automatically via mDNS
-2. Add device in HA → Settings → Devices
-3. Assign Wyoming voice assistant pipeline to the device
-4. Set up room-specific automations (e.g., "Living Room" light control from that satellite)
+### Audio Stack
+Uses `i2s_audio` platform with external ADC/DAC codec chips:
+- **Microphone**: ES7210 ADC via I2S, 16kHz 16-bit mono
+- **Speaker**: ES8311 DAC via I2S, 48kHz 16-bit mono (left channel)
+- **Media player**: wraps speaker with volume control (min 50%, max 85%)
+### Wake Word
+On-device `micro_wake_word` component with `hey_jarvis` model. Can optionally be switched to Home Assistant streaming wake word via a selector entity.
+### Display
+`ili9xxx` platform with model `S3BOX`. Uses `update_interval: never` — display updates are triggered by scripts on voice assistant state changes. Static 320×240 PNG images for each state are compiled into firmware. No text overlays — voice-only interaction.
+Screen auto-dims after a configurable idle timeout (default 1 min, adjustable 1–60 min via HA entity). Wakes on voice activity or radar presence detection.
+### Sensor Dock (ESP32-S3-BOX-3-SENSOR)
+Optional accessory dock connected via secondary I2C bus (GPIO40/41, 100kHz):
+- **AHT-30** (temp/humidity) — `aht10` component with variant AHT20, 30s update interval
+- **AT581x mmWave radar** — presence detection via GPIO21, I2C for settings config
+- **Radar RF switch** — toggle radar on/off from HA
+- Radar configured on boot: sensing_distance=600, trigger_keep=5s, hw_frontend_reset=true
+### Voice Assistant
+ESPHome's `voice_assistant` component connects to HA via the ESPHome native API (not directly to Wyoming). HA orchestrates the pipeline:
+1. Audio → Wyoming STT (Mac Mini) → text
+2. Text → OpenClaw conversation agent → response
+3. Response → Wyoming TTS (Mac Mini) → audio back to ESP32
---
@@ -303,43 +131,71 @@ After flashing:
```
homeai-esp32/
+├── PLAN.md
+├── setup.sh                       # env check + flash/ota/logs commands
└── esphome/
-    ├── base.yaml
-    ├── voice.yaml
-    ├── display.yaml
-    ├── animations.yaml
-    ├── s3-box-living-room.yaml
-    ├── s3-box-bedroom.yaml        # template, fill in when hardware available
-    ├── s3-box-kitchen.yaml        # template
-    └── secrets.yaml               # gitignored
+    ├── secrets.yaml               # gitignored — WiFi + API key
+    ├── homeai-living-room.yaml    # first unit (full config)
+    ├── homeai-bedroom.yaml        # future: copy + change substitutions
+    ├── homeai-kitchen.yaml        # future: copy + change substitutions
+    └── illustrations/             # 320×240 PNG face images
+        ├── idle.png
+        ├── loading.png
+        ├── listening.png
+        ├── thinking.png
+        ├── replying.png
+        ├── error.png
+        └── timer_finished.png
```
---
-## Wake Word Decisions
+## ESPHome Environment
+```bash
+# Dedicated venv (Python 3.12) — do NOT share with voice/whisper venvs
+~/homeai-esphome-env/bin/esphome version   # ESPHome 2026.2.4+
+# Quick commands
+cd ~/gitea/homeai/homeai-esp32
+~/homeai-esphome-env/bin/esphome run esphome/homeai-living-room.yaml    # compile + flash
+~/homeai-esphome-env/bin/esphome logs esphome/homeai-living-room.yaml   # stream logs
+# Or use the setup script
+./setup.sh flash      # compile + USB flash
+./setup.sh ota        # compile + OTA update
+./setup.sh logs       # stream device logs
+./setup.sh validate   # check YAML without compiling
+```
+---
+## Wake Word Options
| Option | Latency | Privacy | Effort |
|---|---|---|---|
-| `hey_jarvis` (built-in microWakeWord) | ~200ms | On-device | Zero |
+| `hey_jarvis` (built-in micro_wake_word) | ~200ms | On-device | Zero |
| Custom word (trained model) | ~200ms | On-device | High — requires 50+ recordings |
-| Mac Mini openWakeWord (stream audio) | ~500ms | On Mac | Medium |
+| HA streaming wake word | ~500ms | On Mac Mini | Medium — stream all audio |
-**Recommendation:** Start with `hey_jarvis`. Train a custom word (character's name) once character name is finalised.
+**Current**: `hey_jarvis` on-device. Train a custom word (character's name) once finalised.
---
## Implementation Steps
-- [ ] Install ESPHome: `pip install esphome`
-- [ ] Write `esphome/secrets.yaml` (gitignored)
-- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
-- [ ] Write `s3-box-living-room.yaml` for first unit
-- [ ] Flash first unit via USB: `esphome run s3-box-living-room.yaml`
-- [ ] Verify unit appears in HA device list
-- [ ] Assign Wyoming voice pipeline to unit in HA
-- [ ] Test: speak wake word → transcription → LLM response → spoken reply
-- [ ] Test: LVGL face cycles through idle → listening → thinking → speaking
-- [ ] Verify OTA update works: change LVGL color, deploy wirelessly
+- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12)
+- [x] Write `esphome/secrets.yaml` (gitignored)
+- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
+- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
+- [x] Write `setup.sh` with flash/ota/logs/validate commands
+- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
+- [x] Flash first unit via USB (living room)
+- [x] Verify unit appears in HA device list
+- [x] Assign Wyoming voice pipeline to unit in HA
+- [x] Test: speak wake word → transcription → LLM response → spoken reply
+- [x] Test: display cycles through idle → listening → thinking → replying
+- [x] Verify OTA update works: change config, deploy wirelessly
- [ ] Write config templates for remaining rooms (bedroom, kitchen)
- [ ] Flash remaining units, verify each works independently
- [ ] Document final MAC address → room name mapping
@@ -351,7 +207,17 @@ homeai-esp32/
- [ ] Wake word "hey jarvis" triggers pipeline reliably from 3m distance
- [ ] STT transcription accuracy >90% for clear speech in quiet room
- [ ] TTS audio plays clearly through ESP32 speaker
-- [ ] LVGL face shows correct state for idle / listening / thinking / speaking / error
+- [ ] Display shows correct state for idle / listening / thinking / replying / error / muted
- [ ] OTA firmware updates work without USB cable
- [ ] Unit reconnects automatically after WiFi drop
- [ ] Unit survives power cycle and resumes normal operation
+---
+## Known Constraints
+- **Memory**: voice_assistant + micro_wake_word + display + sensor dock is near the limit. Do NOT add Bluetooth or LVGL widgets — they will cause crashes.
+- **WiFi**: 2.4GHz only. 5GHz networks are not supported.
+- **Speaker**: 1W built-in. Volume capped at 85% to avoid distortion.
+- **Display**: Static PNGs compiled into firmware. To change images, reflash via OTA (~1-2 min).
+- **First compile**: Downloads ESP-IDF toolchain (~500MB), takes 5-10 minutes. Incremental builds are 1-2 minutes.

263
homeai-esp32/deploy.sh Executable file
View File

@@ -0,0 +1,263 @@
#!/usr/bin/env bash
# homeai-esp32/deploy.sh — Quick OTA deploy for ESP32-S3-BOX-3 satellites
#
# Usage:
# ./deploy.sh — deploy config + images to living room (default)
# ./deploy.sh bedroom — deploy to bedroom unit
# ./deploy.sh --images-only — deploy existing PNGs from illustrations/ (no regen)
# ./deploy.sh --regen-images — regenerate placeholder PNGs then deploy
# ./deploy.sh --validate — validate config without deploying
# ./deploy.sh --all — deploy to all configured units
#
# Images are compiled into firmware, so any PNG changes require a reflash.
# To use custom images: drop 320x240 PNGs into esphome/illustrations/ then ./deploy.sh
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ESPHOME_DIR="${SCRIPT_DIR}/esphome"
ESPHOME_VENV="${HOME}/homeai-esphome-env"
ESPHOME="${ESPHOME_VENV}/bin/esphome"
PYTHON="${ESPHOME_VENV}/bin/python3"
ILLUSTRATIONS_DIR="${ESPHOME_DIR}/illustrations"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*"; exit 1; }
log_step() { echo -e "${CYAN}[STEP]${NC} $*"; }
# ─── Available units ──────────────────────────────────────────────────────────
UNIT_NAMES=(living-room bedroom kitchen)
DEFAULT_UNIT="living-room"
unit_config() {
case "$1" in
living-room) echo "homeai-living-room.yaml" ;;
bedroom) echo "homeai-bedroom.yaml" ;;
kitchen) echo "homeai-kitchen.yaml" ;;
*) echo "" ;;
esac
}
unit_list() {
echo "${UNIT_NAMES[*]}"
}
# ─── Face image generator ────────────────────────────────────────────────────
generate_faces() {
log_step "Generating face illustrations (320x240 PNG)..."
"${PYTHON}" << 'PYEOF'
from PIL import Image, ImageDraw
import os
WIDTH, HEIGHT = 320, 240
OUT = os.environ.get("ILLUSTRATIONS_DIR", "esphome/illustrations")
def draw_face(draw, eye_color, mouth_color, eye_height=40, eye_y=80, mouth_style="smile"):
ex1, ey1 = 95, eye_y
draw.ellipse([ex1-25, ey1-eye_height//2, ex1+25, ey1+eye_height//2], fill=eye_color)
ex2, ey2 = 225, eye_y
draw.ellipse([ex2-25, ey2-eye_height//2, ex2+25, ey2+eye_height//2], fill=eye_color)
if mouth_style == "smile":
draw.arc([110, 140, 210, 200], start=0, end=180, fill=mouth_color, width=3)
elif mouth_style == "open":
draw.ellipse([135, 150, 185, 190], fill=mouth_color)
elif mouth_style == "flat":
draw.line([120, 170, 200, 170], fill=mouth_color, width=3)
elif mouth_style == "frown":
draw.arc([110, 160, 210, 220], start=180, end=360, fill=mouth_color, width=3)
states = {
"idle": {"eye_color": "#FFFFFF", "mouth_color": "#FFFFFF", "eye_height": 40, "mouth_style": "smile"},
"loading": {"eye_color": "#6366F1", "mouth_color": "#6366F1", "eye_height": 30, "mouth_style": "flat"},
"listening": {"eye_color": "#00BFFF", "mouth_color": "#00BFFF", "eye_height": 50, "mouth_style": "open"},
"thinking": {"eye_color": "#A78BFA", "mouth_color": "#A78BFA", "eye_height": 20, "mouth_style": "flat"},
"replying": {"eye_color": "#10B981", "mouth_color": "#10B981", "eye_height": 40, "mouth_style": "open"},
"error": {"eye_color": "#EF4444", "mouth_color": "#EF4444", "eye_height": 40, "mouth_style": "frown"},
"timer_finished": {"eye_color": "#F59E0B", "mouth_color": "#F59E0B", "eye_height": 50, "mouth_style": "smile"},
}
os.makedirs(OUT, exist_ok=True)
for name, p in states.items():
img = Image.new("RGBA", (WIDTH, HEIGHT), (0, 0, 0, 255))
draw = ImageDraw.Draw(img)
draw_face(draw, p["eye_color"], p["mouth_color"], p["eye_height"], mouth_style=p["mouth_style"])
img.save(f"{OUT}/{name}.png")
print(f" {name}.png")
PYEOF
log_ok "Generated 7 face illustrations"
}
# ─── Check existing images ───────────────────────────────────────────────────
REQUIRED_IMAGES=(idle loading listening thinking replying error timer_finished)
check_images() {
local missing=()
for name in "${REQUIRED_IMAGES[@]}"; do
if [[ ! -f "${ILLUSTRATIONS_DIR}/${name}.png" ]]; then
missing+=("${name}.png")
fi
done
if [[ ${#missing[@]} -gt 0 ]]; then
log_error "Missing illustrations: ${missing[*]}
Place 320x240 PNGs in ${ILLUSTRATIONS_DIR}/ or use --regen-images to generate placeholders."
fi
# Resize any images that aren't 320x240
local resized=0
for name in "${REQUIRED_IMAGES[@]}"; do
local img_path="${ILLUSTRATIONS_DIR}/${name}.png"
local dims
dims=$("${PYTHON}" -c "from PIL import Image; im=Image.open('${img_path}'); print(f'{im.width}x{im.height}')")
if [[ "$dims" != "320x240" ]]; then
log_warn "${name}.png is ${dims}, resizing to 320x240..."
"${PYTHON}" -c "
from PIL import Image
im = Image.open('${img_path}')
im = im.resize((320, 240), Image.LANCZOS)
im.save('${img_path}')
"
resized=$((resized + 1))
fi
done
if [[ $resized -gt 0 ]]; then
log_ok "Resized ${resized} image(s) to 320x240"
fi
log_ok "All ${#REQUIRED_IMAGES[@]} illustrations present and 320x240"
for name in "${REQUIRED_IMAGES[@]}"; do
local size
size=$(wc -c < "${ILLUSTRATIONS_DIR}/${name}.png" | tr -d ' ')
echo -e " ${name}.png (${size} bytes)"
done
}
# ─── Deploy to a single unit ─────────────────────────────────────────────────
deploy_unit() {
local unit_name="$1"
local config
config="$(unit_config "$unit_name")"
if [[ -z "$config" ]]; then
log_error "Unknown unit: ${unit_name}. Available: $(unit_list)"
fi
local config_path="${ESPHOME_DIR}/${config}"
if [[ ! -f "$config_path" ]]; then
log_error "Config not found: ${config_path}"
fi
log_step "Validating ${config}..."
cd "${ESPHOME_DIR}"
"${ESPHOME}" config "${config}" > /dev/null
log_ok "Config valid"
log_step "Compiling + OTA deploying ${config}..."
"${ESPHOME}" run "${config}" --device OTA 2>&1
log_ok "Deployed to ${unit_name}"
}
# ─── Main ─────────────────────────────────────────────────────────────────────
IMAGES_ONLY=false
REGEN_IMAGES=false
VALIDATE_ONLY=false
DEPLOY_ALL=false
TARGET="${DEFAULT_UNIT}"
while [[ $# -gt 0 ]]; do
case "$1" in
--images-only) IMAGES_ONLY=true; shift ;;
--regen-images) REGEN_IMAGES=true; shift ;;
--validate) VALIDATE_ONLY=true; shift ;;
--all) DEPLOY_ALL=true; shift ;;
--help|-h)
echo "Usage: $0 [unit-name] [--images-only] [--regen-images] [--validate] [--all]"
echo ""
echo "Units: $(unit_list)"
echo ""
echo "Options:"
echo " --images-only Deploy existing PNGs from illustrations/ (for custom images)"
echo " --regen-images Regenerate placeholder face PNGs then deploy"
echo " --validate Validate config without deploying"
echo " --all Deploy to all configured units"
echo ""
echo "Examples:"
echo " $0 # deploy config to living-room"
echo " $0 bedroom # deploy to bedroom"
echo " $0 --images-only # deploy with current images (custom or generated)"
echo " $0 --regen-images # regenerate placeholder faces + deploy"
echo " $0 --all # deploy to all units"
echo ""
echo "Custom images: drop 320x240 PNGs into esphome/illustrations/"
echo "Required files: ${REQUIRED_IMAGES[*]}"
exit 0
;;
*)
if [[ -n "$(unit_config "$1")" ]]; then
TARGET="$1"
else
log_error "Unknown option or unit: $1. Use --help for usage."
fi
shift
;;
esac
done
# Check ESPHome
if [[ ! -x "${ESPHOME}" ]]; then
log_error "ESPHome not found at ${ESPHOME}. Run setup.sh first."
fi
# Regenerate placeholder images if requested
if $REGEN_IMAGES; then
export ILLUSTRATIONS_DIR
generate_faces
fi
# Check existing images (verify present + resize if not 320x240)
check_images
# Validate only
if $VALIDATE_ONLY; then
cd "${ESPHOME_DIR}"
for unit_name in "${UNIT_NAMES[@]}"; do
config="$(unit_config "$unit_name")"
if [[ -f "${config}" ]]; then
log_step "Validating ${config}..."
"${ESPHOME}" config "${config}" > /dev/null && log_ok "${config} valid" || log_warn "${config} invalid"
fi
done
exit 0
fi
# Deploy
if $DEPLOY_ALL; then
for unit_name in "${UNIT_NAMES[@]}"; do
config="$(unit_config "$unit_name")"
if [[ -f "${ESPHOME_DIR}/${config}" ]]; then
deploy_unit "$unit_name"
else
log_warn "Skipping ${unit_name}${config} not found"
fi
done
else
deploy_unit "$TARGET"
fi
echo ""
log_ok "Deploy complete!"

5
homeai-esp32/esphome/.gitignore vendored Normal file
View File

@@ -0,0 +1,5 @@
# Gitignore settings for ESPHome
# This is an example and may include too much for your use-case.
# You can modify this file to suit your needs.
/.esphome/
/secrets.yaml

View File

@@ -0,0 +1,885 @@
---
# HomeAI Living Room Satellite — ESP32-S3-BOX-3
# Based on official ESPHome voice assistant config
# https://github.com/esphome/wake-word-voice-assistants
substitutions:
name: homeai-living-room
friendly_name: HomeAI Living Room
# Face illustrations — compiled into firmware (320x240 PNG)
loading_illustration_file: illustrations/loading.png
idle_illustration_file: illustrations/idle.png
listening_illustration_file: illustrations/listening.png
thinking_illustration_file: illustrations/thinking.png
replying_illustration_file: illustrations/replying.png
error_illustration_file: illustrations/error.png
timer_finished_illustration_file: illustrations/timer_finished.png
# Dark background for all states (matches HomeAI dashboard theme)
loading_illustration_background_color: "000000"
idle_illustration_background_color: "000000"
listening_illustration_background_color: "000000"
thinking_illustration_background_color: "000000"
replying_illustration_background_color: "000000"
error_illustration_background_color: "000000"
voice_assist_idle_phase_id: "1"
voice_assist_listening_phase_id: "2"
voice_assist_thinking_phase_id: "3"
voice_assist_replying_phase_id: "4"
voice_assist_not_ready_phase_id: "10"
voice_assist_error_phase_id: "11"
voice_assist_muted_phase_id: "12"
voice_assist_timer_finished_phase_id: "20"
font_family: Figtree
font_glyphsets: "GF_Latin_Core"
esphome:
name: ${name}
friendly_name: ${friendly_name}
min_version: 2025.5.0
name_add_mac_suffix: false
on_boot:
priority: 600
then:
- script.execute: draw_display
- at581x.settings:
id: radar
hw_frontend_reset: true
sensing_distance: 600
trigger_keep: 5000ms
- delay: 30s
- if:
condition:
lambda: return id(init_in_progress);
then:
- lambda: id(init_in_progress) = false;
- script.execute: draw_display
esp32:
board: esp32s3box
flash_size: 16MB
cpu_frequency: 240MHz
framework:
type: esp-idf
sdkconfig_options:
CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
psram:
mode: octal
speed: 80MHz
wifi:
ssid: !secret wifi_ssid
password: !secret wifi_password
ap:
ssid: "HomeAI Fallback"
on_connect:
- script.execute: draw_display
on_disconnect:
- script.execute: draw_display
captive_portal:
api:
encryption:
key: !secret api_key
# Prevent device from rebooting if HA connection drops temporarily
reboot_timeout: 0s
on_client_connected:
- script.execute: draw_display
on_client_disconnected:
# Debounce: wait 5s before showing "HA not found" to avoid flicker on brief drops
- delay: 5s
- if:
condition:
not:
api.connected:
then:
- script.execute: draw_display
ota:
- platform: esphome
id: ota_esphome
logger:
hardware_uart: USB_SERIAL_JTAG
button:
- platform: factory_reset
id: factory_reset_btn
internal: true
binary_sensor:
- platform: gpio
pin:
number: GPIO0
ignore_strapping_warning: true
mode: INPUT_PULLUP
inverted: true
id: left_top_button
internal: true
on_multi_click:
# Short press: dismiss timer / toggle mute
- timing:
- ON for at least 50ms
- OFF for at least 50ms
then:
- if:
condition:
switch.is_on: timer_ringing
then:
- switch.turn_off: timer_ringing
else:
- switch.toggle: mute
# Long press (10s): factory reset
- timing:
- ON for at least 10s
then:
- button.press: factory_reset_btn
- platform: gpio
pin: GPIO21
name: Presence
id: radar_presence
device_class: occupancy
on_press:
- script.execute: screen_wake
- script.execute: screen_idle_timer
# --- Display backlight ---
output:
- platform: ledc
pin: GPIO47
id: backlight_output
light:
- platform: monochromatic
id: led
name: Screen
icon: "mdi:television"
entity_category: config
output: backlight_output
restore_mode: RESTORE_DEFAULT_ON
default_transition_length: 250ms
# --- Audio hardware ---
i2c:
- id: audio_bus
scl: GPIO18
sda: GPIO8
- id: sensor_bus
scl: GPIO40
sda: GPIO41
frequency: 100kHz
i2s_audio:
- id: i2s_audio_bus
i2s_lrclk_pin:
number: GPIO45
ignore_strapping_warning: true
i2s_bclk_pin: GPIO17
i2s_mclk_pin: GPIO2
audio_adc:
- platform: es7210
id: es7210_adc
i2c_id: audio_bus
bits_per_sample: 16bit
sample_rate: 16000
audio_dac:
- platform: es8311
id: es8311_dac
i2c_id: audio_bus
bits_per_sample: 16bit
sample_rate: 48000
microphone:
- platform: i2s_audio
id: box_mic
sample_rate: 16000
i2s_din_pin: GPIO16
bits_per_sample: 16bit
adc_type: external
speaker:
- platform: i2s_audio
id: box_speaker
i2s_dout_pin: GPIO15
dac_type: external
sample_rate: 48000
bits_per_sample: 16bit
channel: left
audio_dac: es8311_dac
buffer_duration: 100ms
media_player:
- platform: speaker
name: None
id: speaker_media_player
volume_min: 0.5
volume_max: 0.85
announcement_pipeline:
speaker: box_speaker
format: FLAC
sample_rate: 48000
num_channels: 1
files:
- id: timer_finished_sound
file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/timer_finished.flac
on_announcement:
- if:
condition:
- microphone.is_capturing:
then:
- script.execute: stop_wake_word
- if:
condition:
- lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
then:
- wait_until:
- not:
voice_assistant.is_running:
- if:
condition:
not:
voice_assistant.is_running:
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: draw_display
on_idle:
- if:
condition:
not:
voice_assistant.is_running:
then:
- script.execute: start_wake_word
- script.execute: set_idle_or_mute_phase
- script.execute: draw_display
# --- Wake word (on-device) ---
micro_wake_word:
id: mww
models:
- hey_jarvis
on_wake_word_detected:
- voice_assistant.start:
wake_word: !lambda return wake_word;
# --- Voice assistant ---
voice_assistant:
id: va
microphone: box_mic
media_player: speaker_media_player
micro_wake_word: mww
noise_suppression_level: 2
auto_gain: 31dBFS
volume_multiplier: 2.0
on_listening:
- lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
- script.execute: draw_display
on_stt_vad_end:
- lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
- script.execute: draw_display
on_tts_start:
- lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
- script.execute: draw_display
on_end:
- wait_until:
condition:
- media_player.is_announcing:
timeout: 0.5s
- wait_until:
- and:
- not:
media_player.is_announcing:
- not:
speaker.is_playing:
- if:
condition:
- lambda: return id(wake_word_engine_location).current_option() == "On device";
then:
- lambda: id(va).set_use_wake_word(false);
- micro_wake_word.start:
- script.execute: set_idle_or_mute_phase
- script.execute: draw_display
on_error:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
- script.execute: draw_display
- delay: 1s
- if:
condition:
switch.is_off: mute
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
else:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: draw_display
on_client_connected:
- lambda: id(init_in_progress) = false;
- script.execute: start_wake_word
- script.execute: set_idle_or_mute_phase
- script.execute: draw_display
on_client_disconnected:
- script.execute: stop_wake_word
- lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
- script.execute: draw_display
on_timer_started:
- script.execute: draw_display
on_timer_cancelled:
- script.execute: draw_display
on_timer_updated:
- script.execute: draw_display
on_timer_tick:
- script.execute: draw_display
on_timer_finished:
- switch.turn_on: timer_ringing
- wait_until:
media_player.is_announcing:
- lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
- script.execute: draw_display
# --- Scripts ---
script:
- id: draw_display
then:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- if:
condition:
wifi.connected:
then:
- if:
condition:
api.connected:
then:
- lambda: |
switch(id(voice_assistant_phase)) {
case ${voice_assist_listening_phase_id}:
id(screen_wake).execute();
id(s3_box_lcd).show_page(listening_page);
id(s3_box_lcd).update();
break;
case ${voice_assist_thinking_phase_id}:
id(screen_wake).execute();
id(s3_box_lcd).show_page(thinking_page);
id(s3_box_lcd).update();
break;
case ${voice_assist_replying_phase_id}:
id(screen_wake).execute();
id(s3_box_lcd).show_page(replying_page);
id(s3_box_lcd).update();
break;
case ${voice_assist_error_phase_id}:
id(screen_wake).execute();
id(s3_box_lcd).show_page(error_page);
id(s3_box_lcd).update();
break;
case ${voice_assist_muted_phase_id}:
id(s3_box_lcd).show_page(muted_page);
id(s3_box_lcd).update();
id(screen_idle_timer).execute();
break;
case ${voice_assist_not_ready_phase_id}:
id(s3_box_lcd).show_page(no_ha_page);
id(s3_box_lcd).update();
break;
case ${voice_assist_timer_finished_phase_id}:
id(screen_wake).execute();
id(s3_box_lcd).show_page(timer_finished_page);
id(s3_box_lcd).update();
break;
default:
id(s3_box_lcd).show_page(idle_page);
id(s3_box_lcd).update();
id(screen_idle_timer).execute();
}
else:
- display.page.show: no_ha_page
- component.update: s3_box_lcd
else:
- display.page.show: no_wifi_page
- component.update: s3_box_lcd
else:
- display.page.show: initializing_page
- component.update: s3_box_lcd
- id: fetch_first_active_timer
then:
- lambda: |
const auto &timers = id(va).get_timers();
auto output_timer = timers.begin()->second;
for (const auto &timer : timers) {
if (timer.second.is_active && timer.second.seconds_left <= output_timer.seconds_left) {
output_timer = timer.second;
}
}
id(global_first_active_timer) = output_timer;
- id: check_if_timers_active
then:
- lambda: |
const auto &timers = id(va).get_timers();
bool output = false;
for (const auto &timer : timers) {
if (timer.second.is_active) { output = true; }
}
id(global_is_timer_active) = output;
- id: fetch_first_timer
then:
- lambda: |
const auto &timers = id(va).get_timers();
auto output_timer = timers.begin()->second;
for (const auto &timer : timers) {
if (timer.second.seconds_left <= output_timer.seconds_left) {
output_timer = timer.second;
}
}
id(global_first_timer) = output_timer;
- id: check_if_timers
then:
- lambda: |
const auto &timers = id(va).get_timers();
bool output = false;
for (const auto &timer : timers) {
if (timer.second.is_active) { output = true; }
}
id(global_is_timer) = output;
- id: draw_timer_timeline
then:
- lambda: |
id(check_if_timers_active).execute();
id(check_if_timers).execute();
if (id(global_is_timer_active)){
id(fetch_first_active_timer).execute();
int active_pixels = round( 320 * id(global_first_active_timer).seconds_left / max(id(global_first_active_timer).total_seconds, static_cast<uint32_t>(1)) );
if (active_pixels > 0){
id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(active_timer_color));
}
} else if (id(global_is_timer)){
id(fetch_first_timer).execute();
int active_pixels = round( 320 * id(global_first_timer).seconds_left / max(id(global_first_timer).total_seconds, static_cast<uint32_t>(1)));
if (active_pixels > 0){
id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(paused_timer_color));
}
}
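# Timeline math: bar width is proportional to time remaining. For example, a
# 10-min timer with 150s left fills round(320 * 150 / 600) = 80 of 320 px;
# max(total_seconds, 1) guards against division by zero on a zero-length timer.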
- id: draw_active_timer_widget
then:
- lambda: |
id(check_if_timers_active).execute();
if (id(global_is_timer_active)){
id(s3_box_lcd).filled_rectangle(80, 40, 160, 50, Color::WHITE);
id(s3_box_lcd).rectangle(80, 40, 160, 50, Color::BLACK);
id(fetch_first_active_timer).execute();
int hours_left = floor(id(global_first_active_timer).seconds_left / 3600);
int minutes_left = floor((id(global_first_active_timer).seconds_left - hours_left * 3600) / 60);
int seconds_left = id(global_first_active_timer).seconds_left - hours_left * 3600 - minutes_left * 60;
auto display_hours = (hours_left < 10 ? "0" : "") + std::to_string(hours_left);
auto display_minute = (minutes_left < 10 ? "0" : "") + std::to_string(minutes_left);
auto display_seconds = (seconds_left < 10 ? "0" : "") + std::to_string(seconds_left);
std::string display_string = "";
if (hours_left > 0) {
display_string = display_hours + ":" + display_minute;
} else {
display_string = display_minute + ":" + display_seconds;
}
id(s3_box_lcd).printf(120, 47, id(font_timer), Color::BLACK, "%s", display_string.c_str());
}
- id: start_wake_word
then:
- if:
condition:
and:
- not:
- voice_assistant.is_running:
- lambda: return id(wake_word_engine_location).current_option() == "On device";
then:
- lambda: id(va).set_use_wake_word(false);
- micro_wake_word.start:
- if:
condition:
and:
- not:
- voice_assistant.is_running:
- lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
then:
- lambda: id(va).set_use_wake_word(true);
- voice_assistant.start_continuous:
- id: stop_wake_word
then:
- if:
condition:
lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
then:
- lambda: id(va).set_use_wake_word(false);
- voice_assistant.stop:
- if:
condition:
lambda: return id(wake_word_engine_location).current_option() == "On device";
then:
- micro_wake_word.stop:
- id: set_idle_or_mute_phase
then:
- if:
condition:
switch.is_off: mute
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
else:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- id: screen_idle_timer
mode: restart
then:
- delay: !lambda return id(screen_off_delay).state * 60000;
- light.turn_off: led
- id: screen_wake
mode: restart
then:
- if:
condition:
light.is_off: led
then:
- light.turn_on:
id: led
brightness: 100%
# --- Switches ---
switch:
- platform: gpio
name: Speaker Enable
pin:
number: GPIO46
ignore_strapping_warning: true
restore_mode: RESTORE_DEFAULT_ON
entity_category: config
disabled_by_default: true
- platform: at581x
at581x_id: radar
name: Radar RF
entity_category: config
- platform: template
name: Mute
id: mute
icon: "mdi:microphone-off"
optimistic: true
restore_mode: RESTORE_DEFAULT_OFF
entity_category: config
on_turn_off:
- microphone.unmute:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
- script.execute: draw_display
on_turn_on:
- microphone.mute:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: draw_display
- platform: template
id: timer_ringing
optimistic: true
internal: true
restore_mode: ALWAYS_OFF
on_turn_off:
- lambda: |-
id(speaker_media_player)
->make_call()
.set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_OFF)
.set_announcement(true)
.perform();
id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 0);
- media_player.stop:
announcement: true
on_turn_on:
- lambda: |-
id(speaker_media_player)
->make_call()
.set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_ONE)
.set_announcement(true)
.perform();
id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 1000);
- media_player.speaker.play_on_device_media_file:
media_file: timer_finished_sound
announcement: true
- delay: 15min
- switch.turn_off: timer_ringing
# --- Wake word engine location selector ---
select:
- platform: template
entity_category: config
name: Wake word engine location
id: wake_word_engine_location
icon: "mdi:account-voice"
optimistic: true
restore_value: true
options:
- In Home Assistant
- On device
initial_option: On device
on_value:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- wait_until:
lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
- if:
condition:
lambda: return x == "In Home Assistant";
then:
- micro_wake_word.stop
- delay: 500ms
- if:
condition:
switch.is_off: mute
then:
- lambda: id(va).set_use_wake_word(true);
- voice_assistant.start_continuous:
- if:
condition:
lambda: return x == "On device";
then:
- lambda: id(va).set_use_wake_word(false);
- voice_assistant.stop
- delay: 500ms
- if:
condition:
switch.is_off: mute
then:
- micro_wake_word.start
# --- Screen idle timeout (minutes) ---
number:
- platform: template
name: Screen off delay
id: screen_off_delay
icon: "mdi:timer-outline"
entity_category: config
unit_of_measurement: min
optimistic: true
restore_value: true
min_value: 1
max_value: 60
step: 1
initial_value: 1
# --- Sensor dock (ESP32-S3-BOX-3-SENSOR) ---
sensor:
- platform: aht10
variant: AHT20
i2c_id: sensor_bus
temperature:
name: Temperature
filters:
- sliding_window_moving_average:
window_size: 5
send_every: 5
humidity:
name: Humidity
filters:
- sliding_window_moving_average:
window_size: 5
send_every: 5
update_interval: 30s
at581x:
i2c_id: sensor_bus
id: radar
# --- Global variables ---
globals:
- id: init_in_progress
type: bool
restore_value: false
initial_value: "true"
- id: voice_assistant_phase
type: int
restore_value: false
initial_value: ${voice_assist_not_ready_phase_id}
- id: global_first_active_timer
type: voice_assistant::Timer
restore_value: false
- id: global_is_timer_active
type: bool
restore_value: false
- id: global_first_timer
type: voice_assistant::Timer
restore_value: false
- id: global_is_timer
type: bool
restore_value: false
# --- Display images ---
image:
- file: ${error_illustration_file}
id: casita_error
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${idle_illustration_file}
id: casita_idle
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${listening_illustration_file}
id: casita_listening
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${thinking_illustration_file}
id: casita_thinking
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${replying_illustration_file}
id: casita_replying
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${timer_finished_illustration_file}
id: casita_timer_finished
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${loading_illustration_file}
id: casita_initializing
resize: 320x240
type: RGB
transparency: alpha_channel
- file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-wifi.png
id: error_no_wifi
resize: 320x240
type: RGB
transparency: alpha_channel
- file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-ha.png
id: error_no_ha
resize: 320x240
type: RGB
transparency: alpha_channel
# --- Fonts (timer widget only) ---
font:
- file:
type: gfonts
family: ${font_family}
weight: 300
id: font_timer
size: 30
glyphsets:
- ${font_glyphsets}
# --- Colors ---
color:
- id: idle_color
hex: ${idle_illustration_background_color}
- id: listening_color
hex: ${listening_illustration_background_color}
- id: thinking_color
hex: ${thinking_illustration_background_color}
- id: replying_color
hex: ${replying_illustration_background_color}
- id: loading_color
hex: ${loading_illustration_background_color}
- id: error_color
hex: ${error_illustration_background_color}
- id: active_timer_color
hex: "26ed3a"
- id: paused_timer_color
hex: "3b89e3"
# --- SPI + Display ---
spi:
- id: spi_bus
clk_pin: 7
mosi_pin: 6
display:
- platform: ili9xxx
id: s3_box_lcd
model: S3BOX
invert_colors: false
data_rate: 40MHz
cs_pin: 5
dc_pin: 4
reset_pin:
number: 48
inverted: true
update_interval: never
pages:
- id: idle_page
lambda: |-
it.fill(id(idle_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
id(draw_active_timer_widget).execute();
- id: listening_page
lambda: |-
it.fill(id(listening_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
- id: thinking_page
lambda: |-
it.fill(id(thinking_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
- id: replying_page
lambda: |-
it.fill(id(replying_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
- id: timer_finished_page
lambda: |-
it.fill(id(idle_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_timer_finished), ImageAlign::CENTER);
- id: error_page
lambda: |-
it.fill(id(error_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
- id: no_ha_page
lambda: |-
it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
- id: no_wifi_page
lambda: |-
it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
- id: initializing_page
lambda: |-
it.fill(id(loading_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
- id: muted_page
lambda: |-
it.fill(Color::BLACK);
id(draw_timer_timeline).execute();
id(draw_active_timer_widget).execute();

Binary files not shown — 7 PNG images added (76–91 KiB each).

View File

@@ -1,76 +1,177 @@
#!/usr/bin/env bash
# homeai-esp32/setup.sh — P6: ESPHome firmware for ESP32-S3-BOX-3
#
-# Components:
-#   - ESPHome — firmware build + flash tool
-#   - base.yaml — shared device config
-#   - voice.yaml — Wyoming Satellite + microWakeWord
-#   - display.yaml — LVGL animated face
-#   - Per-room configs — s3-box-living-room.yaml, etc.
+# Usage:
+#   ./setup.sh           — check environment + validate config
+#   ./setup.sh flash     — compile + flash via USB (first time)
+#   ./setup.sh ota       — compile + flash via OTA (wireless)
+#   ./setup.sh logs      — stream device logs
+#   ./setup.sh validate  — validate YAML without compiling
#
# Prerequisites:
-#   - P1 (homeai-infra) — Home Assistant running
-#   - P3 (homeai-voice) — Wyoming STT/TTS running (ports 10300/10301)
-#   - Python 3.10+
-#   - USB-C cable for first flash (subsequent updates via OTA)
-#   - On Linux: ensure user is in the dialout group for USB access
+#   - ~/homeai-esphome-env — Python 3.12 venv with ESPHome
+#   - Home Assistant running on 10.0.0.199
+#   - Wyoming STT/TTS running on Mac Mini (ports 10300/10301)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
-source "${REPO_DIR}/scripts/common.sh"
+ESPHOME_VENV="${HOME}/homeai-esphome-env"
+ESPHOME="${ESPHOME_VENV}/bin/esphome"
+ESPHOME_DIR="${SCRIPT_DIR}/esphome"
+DEFAULT_CONFIG="${ESPHOME_DIR}/homeai-living-room.yaml"
-log_section "P6: ESP32 Firmware (ESPHome)"
-detect_platform
+# Colors
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'
-# ─── Prerequisite check ────────────────────────────────────────────────────────
-log_info "Checking prerequisites..."
+log_info()  { echo -e "${BLUE}[INFO]${NC} $*"; }
+log_ok()    { echo -e "${GREEN}[OK]${NC} $*"; }
+log_warn()  { echo -e "${YELLOW}[WARN]${NC} $*"; }
+log_error() { echo -e "${RED}[ERROR]${NC} $*"; }
-if ! command_exists python3; then
-    log_warn "python3 not found — required for ESPHome"
-fi
-if ! command_exists esphome; then
-    log_info "ESPHome not installed. To install: pip install esphome"
-fi
-if [[ "$OS_TYPE" == "linux" ]]; then
-    if ! groups "$USER" | grep -q dialout; then
-        log_warn "User '$USER' not in 'dialout' group — USB flashing may fail."
-        log_warn "Fix: sudo usermod -aG dialout $USER (then log out and back in)"
-    fi
-fi
-# Check P3 dependency
-if ! curl -sf http://localhost:8123 -o /dev/null 2>/dev/null; then
-    log_warn "Home Assistant (P1) not reachable — ESP32 units won't auto-discover"
-fi
-# ─── TODO: Implementation ──────────────────────────────────────────────────────
-cat <<'EOF'
-┌─────────────────────────────────────────────────────────────────┐
-│ P6: homeai-esp32 — NOT YET IMPLEMENTED                          │
-│                                                                 │
-│ Implementation steps:                                           │
-│   1. pip install esphome                                        │
-│   2. Create esphome/secrets.yaml (gitignored)                   │
-│   3. Create esphome/base.yaml (WiFi, API, OTA)                  │
-│   4. Create esphome/voice.yaml (Wyoming Satellite, wakeword)    │
-│   5. Create esphome/display.yaml (LVGL face, 5 states)          │
-│   6. Create esphome/animations.yaml (face state scripts)        │
-│   7. Create per-room configs (s3-box-living-room.yaml, etc.)    │
-│   8. First flash via USB: esphome run esphome/<room>.yaml       │
-│   9. Subsequent OTA: esphome upload esphome/<room>.yaml         │
-│   10. Add to Home Assistant → assign Wyoming voice pipeline     │
-│                                                                 │
-│ Quick flash (once esphome/ is ready):                           │
-│   esphome run esphome/s3-box-living-room.yaml                   │
-│   esphome logs esphome/s3-box-living-room.yaml                  │
-└─────────────────────────────────────────────────────────────────┘
-EOF
-log_info "P6 is not yet implemented. See homeai-esp32/PLAN.md for details."
-exit 0
+# ─── Environment checks ──────────────────────────────────────────────────────
+check_env() {
+  local ok=true
+  log_info "Checking environment..."
+  # ESPHome venv
+  if [[ -x "${ESPHOME}" ]]; then
+    local version
+    version=$("${ESPHOME}" version 2>/dev/null)
+    log_ok "ESPHome: ${version}"
+  else
+    log_error "ESPHome not found at ${ESPHOME}"
+    echo "  Install: /opt/homebrew/opt/python@3.12/bin/python3.12 -m venv ${ESPHOME_VENV}"
+    echo "           ${ESPHOME_VENV}/bin/pip install 'esphome>=2025.5.0'"
+    ok=false
+  fi
+  # secrets.yaml
+  if [[ -f "${ESPHOME_DIR}/secrets.yaml" ]]; then
+    if grep -q "YOUR_" "${ESPHOME_DIR}/secrets.yaml" 2>/dev/null; then
+      log_warn "secrets.yaml contains placeholder values — edit before flashing"
+      ok=false
+    else
+      log_ok "secrets.yaml configured"
+    fi
+  else
+    log_error "secrets.yaml not found at ${ESPHOME_DIR}/secrets.yaml"
+    ok=false
+  fi
+  # Config file
+  if [[ -f "${DEFAULT_CONFIG}" ]]; then
+    log_ok "Config: $(basename "${DEFAULT_CONFIG}")"
+  else
+    log_error "Config not found: ${DEFAULT_CONFIG}"
+    ok=false
+  fi
+  # Illustrations
+  local illust_dir="${ESPHOME_DIR}/illustrations"
+  local illust_count
+  illust_count=$(find "${illust_dir}" -name "*.png" 2>/dev/null | wc -l | tr -d ' ')
+  if [[ "${illust_count}" -ge 7 ]]; then
+    log_ok "Illustrations: ${illust_count} PNGs in illustrations/"
+  else
+    log_warn "Missing illustrations (found ${illust_count}, need 7)"
+  fi
+  # Wyoming services on Mac Mini
+  if curl -sf "http://localhost:10300" -o /dev/null 2>/dev/null || nc -z localhost 10300 2>/dev/null; then
+    log_ok "Wyoming STT (port 10300) reachable"
+  else
+    log_warn "Wyoming STT (port 10300) not reachable"
+  fi
+  if curl -sf "http://localhost:10301" -o /dev/null 2>/dev/null || nc -z localhost 10301 2>/dev/null; then
+    log_ok "Wyoming TTS (port 10301) reachable"
+  else
+    log_warn "Wyoming TTS (port 10301) not reachable"
+  fi
+  # Home Assistant
+  if curl -sk "https://10.0.0.199:8123" -o /dev/null 2>/dev/null; then
+    log_ok "Home Assistant (10.0.0.199:8123) reachable"
+  else
+    log_warn "Home Assistant not reachable — ESP32 won't be able to connect"
+  fi
+  if $ok; then
+    log_ok "Environment ready"
+  else
+    log_warn "Some issues found — fix before flashing"
+  fi
+}
+# ─── Commands ─────────────────────────────────────────────────────────────────
+cmd_flash() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Compiling + flashing via USB: $(basename "${config}")"
+  log_info "First compile downloads ESP-IDF toolchain (~500MB), takes 5-10 min..."
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" run "$(basename "${config}")"
+}
+cmd_ota() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Compiling + OTA upload: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" run "$(basename "${config}")"
+}
+cmd_logs() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Streaming logs for: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" logs "$(basename "${config}")"
+}
+cmd_validate() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Validating: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" config "$(basename "${config}")"
+  log_ok "Config valid"
+}
+# ─── Main ─────────────────────────────────────────────────────────────────────
+case "${1:-}" in
+  flash)
+    check_env
+    echo ""
+    cmd_flash "${2:-}"
+    ;;
+  ota)
+    check_env
+    echo ""
+    cmd_ota "${2:-}"
+    ;;
+  logs)
+    cmd_logs "${2:-}"
+    ;;
+  validate)
+    cmd_validate "${2:-}"
+    ;;
+  *)
+    check_env
+    echo ""
+    echo "Usage: $0 {flash|ota|logs|validate} [config.yaml]"
+    echo ""
+    echo "  flash     Compile + flash via USB (first time)"
+    echo "  ota       Compile + flash via OTA (wireless, after first flash)"
+    echo "  logs      Stream device logs"
+    echo "  validate  Validate YAML config without compiling"
+    echo ""
+    echo "Default config: $(basename "${DEFAULT_CONFIG}")"
+    ;;
+esac

219
homeai-images/API_GUIDE.md Normal file
View File

@@ -0,0 +1,219 @@
# GAZE REST API Guide
## Setup
1. Open **Settings** in the GAZE web UI
2. Scroll to **REST API Key** and click **Regenerate**
3. Copy the key — you'll need it for all API requests
## Authentication
Every request must include your API key via one of:
- **Header (recommended):** `X-API-Key: <your-key>`
- **Query parameter:** `?api_key=<your-key>`
Responses for auth failures:
| Status | Meaning |
|--------|---------|
| `401` | Missing API key |
| `403` | Invalid API key |
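For example, with Python's `requests` (a sketch; substitute your real key, and note the host/port follow the examples later in this guide):

```python
import requests

resp = requests.get(
    "http://localhost:5782/api/v1/presets",
    headers={"X-API-Key": "your-key-here"},  # or: params={"api_key": "your-key-here"}
)
if resp.status_code in (401, 403):
    raise SystemExit(f"auth failed ({resp.status_code}): check your API key")
print(resp.json())
```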
## Endpoints
### List Presets
```
GET /api/v1/presets
```
Returns all available presets.
**Response:**
```json
{
"presets": [
{
"preset_id": "example_01",
"slug": "example_01",
"name": "Example Preset",
"has_cover": true
}
]
}
```
### Generate Image
```
POST /api/v1/generate/<preset_slug>
```
Queue one or more image generations using a preset's configuration. All body parameters are optional — when omitted, the preset's own settings are used.
**Request body (JSON):**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `count` | int | `1` | Number of images to generate (1–20) |
| `checkpoint` | string | — | Override checkpoint path (e.g. `"Illustrious/model.safetensors"`) |
| `extra_positive` | string | `""` | Additional positive prompt tags appended to the generated prompt |
| `extra_negative` | string | `""` | Additional negative prompt tags |
| `seed` | int | random | Fixed seed for reproducible generation |
| `width` | int | — | Output width in pixels (must provide both width and height) |
| `height` | int | — | Output height in pixels (must provide both width and height) |
**Response (202):**
```json
{
"jobs": [
{ "job_id": "783f0268-ba85-4426-8ca2-6393c844c887", "status": "queued" }
]
}
```
**Errors:**
| Status | Cause |
|--------|-------|
| `400` | Invalid parameters (bad count, seed, or mismatched width/height) |
| `404` | Preset slug not found |
| `500` | Internal generation error |
### Check Job Status
```
GET /api/v1/job/<job_id>
```
Poll this endpoint to track generation progress.
**Response:**
```json
{
"id": "783f0268-ba85-4426-8ca2-6393c844c887",
"label": "Preset: Example Preset preview",
"status": "done",
"error": null,
"result": {
"image_url": "/static/uploads/presets/example_01/gen_1773601346.png",
"relative_path": "presets/example_01/gen_1773601346.png",
"seed": 927640517599332
}
}
```
**Job statuses:**
| Status | Meaning |
|--------|---------|
| `pending` | Waiting in queue (returned as `queued` in the generate response) |
| `processing` | Currently generating |
| `done` | Complete — `result` contains image info |
| `failed` | Error occurred — check `error` field |
The `result` object is only present when status is `done`. Use `seed` from the result to reproduce the exact same image later.
**Retrieving the image:** The `image_url` is a path relative to the server root. Fetch it directly:
```
GET http://<host>:5782/static/uploads/presets/example_01/gen_1773601346.png
```
Image retrieval does not require authentication.
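
For example, saving the image from the sample response above:

```bash
curl -o gen_1773601346.png \
  "http://localhost:5782/static/uploads/presets/example_01/gen_1773601346.png"
```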
## Examples
### Generate a single image and wait for it
```bash
API_KEY="your-key-here"
HOST="http://localhost:5782"
# Queue generation
JOB_ID=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{}' \
"$HOST/api/v1/generate/example_01" | python3 -c "import sys,json; print(json.load(sys.stdin)['jobs'][0]['job_id'])")
echo "Job: $JOB_ID"
# Poll until done
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" "$HOST/api/v1/job/$JOB_ID")
STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
echo "Status: $STATUS"
if [ "$STATUS" = "done" ] || [ "$STATUS" = "failed" ]; then
echo "$RESULT" | python3 -m json.tool
break
fi
sleep 5
done
```
### Generate 3 images with extra prompts
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"count": 3,
"extra_positive": "smiling, outdoors",
"extra_negative": "blurry"
}' \
"$HOST/api/v1/generate/example_01"
```
### Reproduce a specific image
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{"seed": 927640517599332}' \
"$HOST/api/v1/generate/example_01"
```
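
### Generate at a custom resolution

`width` and `height` must be supplied together, per the parameter table above (the values here are illustrative):

```bash
curl -X POST \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"width": 832, "height": 1216}' \
  "$HOST/api/v1/generate/example_01"
```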
### Python example
```python
import requests
import time
HOST = "http://localhost:5782"
API_KEY = "your-key-here"
HEADERS = {"X-API-Key": API_KEY, "Content-Type": "application/json"}
# List presets
presets = requests.get(f"{HOST}/api/v1/presets", headers=HEADERS).json()
print(f"Available presets: {[p['name'] for p in presets['presets']]}")
# Generate
resp = requests.post(
f"{HOST}/api/v1/generate/{presets['presets'][0]['slug']}",
headers=HEADERS,
json={"count": 1},
).json()
job_id = resp["jobs"][0]["job_id"]
print(f"Queued job: {job_id}")
# Poll
while True:
status = requests.get(f"{HOST}/api/v1/job/{job_id}", headers=HEADERS).json()
print(f"Status: {status['status']}")
if status["status"] in ("done", "failed"):
break
time.sleep(5)
if status["status"] == "done":
image_url = f"{HOST}{status['result']['image_url']}"
print(f"Image: {image_url}")
print(f"Seed: {status['result']['seed']}")
```

View File

@@ -12,17 +12,27 @@
     <string>/Users/aodhan/gitea/homeai/homeai-llm/scripts/preload-models.sh</string>
   </array>
+  <key>EnvironmentVariables</key>
+  <dict>
+    <!-- Override to change which medium model stays warm -->
+    <key>HOMEAI_MEDIUM_MODEL</key>
+    <string>qwen3.5:35b-a3b</string>
+  </dict>
   <key>RunAtLoad</key>
   <true/>
+  <key>KeepAlive</key>
+  <true/>
   <key>StandardOutPath</key>
   <string>/tmp/homeai-preload-models.log</string>
   <key>StandardErrorPath</key>
   <string>/tmp/homeai-preload-models-error.log</string>
-  <!-- Delay 15s to let Ollama start first -->
+  <!-- If the script exits/crashes, wait 30s before restarting -->
   <key>ThrottleInterval</key>
-  <integer>15</integer>
+  <integer>30</integer>
 </dict>
 </plist>

homeai-llm/scripts/preload-models.sh
View File

@@ -1,19 +1,73 @@
 #!/bin/bash
-# Pre-load voice pipeline models into Ollama with infinite keep_alive.
-# Run after Ollama starts (called by launchd or manually).
+# Keep voice pipeline models warm in Ollama VRAM.
+# Runs as a loop — checks every 5 minutes, re-pins any model that got evicted.
 # Only pins lightweight/MoE models — large dense models (70B) use default expiry.
 
 OLLAMA_URL="http://localhost:11434"
+CHECK_INTERVAL=300  # seconds between checks
 
-# Wait for Ollama to be ready
-for i in $(seq 1 30); do
-    curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1 && break
-    sleep 2
-done
+# Medium model can be overridden via env var (e.g. by persona config)
+HOMEAI_MEDIUM_MODEL="${HOMEAI_MEDIUM_MODEL:-qwen3.5:35b-a3b}"
+
+# Models to keep warm: "name|description"
+MODELS=(
+    "qwen2.5:7b|small (4.7GB) — fast fallback"
+    "${HOMEAI_MEDIUM_MODEL}|medium — persona default"
+)
+
+wait_for_ollama() {
+    for i in $(seq 1 30); do
+        curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1 && return 0
+        sleep 2
+    done
+    return 1
+}
+
+is_model_loaded() {
+    local model="$1"
+    curl -sf "$OLLAMA_URL/api/ps" 2>/dev/null \
+        | python3 -c "
+import json, sys
+data = json.load(sys.stdin)
+names = [m['name'] for m in data.get('models', [])]
+sys.exit(0 if '$model' in names else 1)
+" 2>/dev/null
+}
+
+pin_model() {
+    local model="$1"
+    local desc="$2"
+    if is_model_loaded "$model"; then
+        echo "[keepwarm] $model already loaded — skipping"
+        return 0
+    fi
+    echo "[keepwarm] Loading $model ($desc) with keep_alive=-1..."
+    curl -sf "$OLLAMA_URL/api/generate" \
+        -d "{\"model\":\"$model\",\"prompt\":\"ready\",\"stream\":false,\"keep_alive\":-1,\"options\":{\"num_ctx\":512}}" \
+        > /dev/null 2>&1
+    if [ $? -eq 0 ]; then
+        echo "[keepwarm] $model pinned in VRAM"
+    else
+        echo "[keepwarm] ERROR: failed to load $model"
+    fi
+}
+
+# --- Main loop ---
+echo "[keepwarm] Starting model keep-warm daemon (interval: ${CHECK_INTERVAL}s)"
+
+# Initial wait for Ollama
+if ! wait_for_ollama; then
+    echo "[keepwarm] ERROR: Ollama not reachable after 60s, exiting"
+    exit 1
+fi
+echo "[keepwarm] Ollama is online"
+
+while true; do
+    for entry in "${MODELS[@]}"; do
+        IFS='|' read -r model desc <<< "$entry"
+        pin_model "$model" "$desc"
+    done
+    sleep "$CHECK_INTERVAL"
+done
-
-# Pin qwen3.5:35b-a3b (MoE, 38.7GB VRAM, voice pipeline default)
-echo "[preload] Loading qwen3.5:35b-a3b with keep_alive=-1..."
-curl -sf "$OLLAMA_URL/api/generate" \
-    -d '{"model":"qwen3.5:35b-a3b","prompt":"ready","stream":false,"keep_alive":-1,"options":{"num_ctx":512}}' \
-    > /dev/null 2>&1
-echo "[preload] qwen3.5:35b-a3b pinned in memory"

159
homeai-rpi/deploy.sh Executable file
View File

@@ -0,0 +1,159 @@
#!/usr/bin/env bash
# homeai-rpi/deploy.sh — Deploy/manage Wyoming Satellite on Raspberry Pi from Mac Mini
#
# Usage:
# ./deploy.sh — full setup (push + install on Pi)
# ./deploy.sh --status — check satellite status
# ./deploy.sh --restart — restart satellite service
# ./deploy.sh --logs — tail satellite logs
# ./deploy.sh --test-audio — record 3s from mic, play back through speaker
# ./deploy.sh --update — update Python packages only
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# ─── Pi connection ──────────────────────────────────────────────────────────
PI_HOST="SELBINA.local"
PI_USER="aodhan"
PI_SSH="${PI_USER}@${PI_HOST}"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*"; exit 1; }
log_step() { echo -e "${CYAN}[STEP]${NC} $*"; }
# ─── SSH helpers ────────────────────────────────────────────────────────────
pi_ssh() {
ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=accept-new "${PI_SSH}" "$@"
}
pi_scp() {
scp -o ConnectTimeout=5 -o StrictHostKeyChecking=accept-new "$@"
}
check_connectivity() {
log_step "Checking connectivity to ${PI_HOST}..."
if ! ping -c 1 -t 3 "${PI_HOST}" &>/dev/null; then
log_error "Cannot reach ${PI_HOST}. Is the Pi on?"
fi
if ! pi_ssh "echo ok" &>/dev/null; then
log_error "SSH to ${PI_SSH} failed. Set up SSH keys:
ssh-copy-id ${PI_SSH}"
fi
log_ok "Connected to ${PI_HOST}"
}
# ─── Commands ───────────────────────────────────────────────────────────────
cmd_setup() {
check_connectivity
log_step "Pushing setup script to Pi..."
pi_scp "${SCRIPT_DIR}/setup.sh" "${PI_SSH}:~/homeai-satellite-setup.sh"
log_step "Running setup on Pi..."
pi_ssh "chmod +x ~/homeai-satellite-setup.sh && ~/homeai-satellite-setup.sh"
log_ok "Setup complete!"
}
cmd_status() {
check_connectivity
log_step "Satellite status:"
pi_ssh "systemctl status homeai-satellite.service --no-pager" || true
}
cmd_restart() {
check_connectivity
log_step "Restarting satellite..."
pi_ssh "sudo systemctl restart homeai-satellite.service"
sleep 2
pi_ssh "systemctl is-active homeai-satellite.service" && log_ok "Satellite running" || log_warn "Satellite not active"
}
cmd_logs() {
check_connectivity
log_info "Tailing satellite logs (Ctrl+C to stop)..."
pi_ssh "journalctl -u homeai-satellite.service -f --no-hostname"
}
cmd_test_audio() {
check_connectivity
log_step "Recording 3 seconds from mic..."
pi_ssh "arecord -D plughw:2,0 -d 3 -f S16_LE -r 16000 -c 1 /tmp/homeai-test.wav 2>/dev/null"
log_step "Playing back through speaker..."
pi_ssh "aplay -D plughw:2,0 /tmp/homeai-test.wav 2>/dev/null"
log_ok "Audio test complete. Did you hear yourself?"
}
cmd_update() {
check_connectivity
log_step "Updating Python packages on Pi..."
pi_ssh "source ~/homeai-satellite/venv/bin/activate && pip install --upgrade wyoming-satellite openwakeword -q"
log_step "Pushing latest scripts..."
pi_scp "${SCRIPT_DIR}/satellite_wrapper.py" "${PI_SSH}:~/homeai-satellite/satellite_wrapper.py"
pi_ssh "sudo systemctl restart homeai-satellite.service"
log_ok "Updated and restarted"
}
cmd_push_wrapper() {
check_connectivity
log_step "Pushing satellite_wrapper.py..."
pi_scp "${SCRIPT_DIR}/satellite_wrapper.py" "${PI_SSH}:~/homeai-satellite/satellite_wrapper.py"
log_step "Restarting satellite..."
pi_ssh "sudo systemctl restart homeai-satellite.service"
sleep 2
pi_ssh "systemctl is-active homeai-satellite.service" && log_ok "Satellite running" || log_warn "Satellite not active — check logs"
}
cmd_test_logs() {
check_connectivity
log_info "Filtered satellite logs — key events only (Ctrl+C to stop)..."
pi_ssh "journalctl -u homeai-satellite.service -f --no-hostname" \
| grep --line-buffered -iE \
'Waiting for wake|Streaming audio|transcript|synthesize|Speaker active|unmute|_writer|timeout|error|Error|Wake word detected|re-arming|resetting'
}
# ─── Main ───────────────────────────────────────────────────────────────────
case "${1:-}" in
--status) cmd_status ;;
--restart) cmd_restart ;;
--logs) cmd_logs ;;
--test-audio) cmd_test_audio ;;
--test-logs) cmd_test_logs ;;
--update) cmd_update ;;
--push-wrapper) cmd_push_wrapper ;;
--help|-h)
echo "Usage: $0 [command]"
echo ""
echo "Commands:"
echo " (none) Full setup — push and install satellite on Pi"
echo " --status Check satellite service status"
echo " --restart Restart satellite service"
echo " --logs Tail satellite logs (live, all)"
echo " --test-logs Tail filtered logs (key events only)"
echo " --test-audio Record 3s from mic, play back on speaker"
echo " --push-wrapper Push satellite_wrapper.py and restart (fast iteration)"
echo " --update Update packages and restart"
echo " --help Show this help"
echo ""
echo "Pi: ${PI_SSH} (${PI_HOST})"
;;
"") cmd_setup ;;
*) log_error "Unknown command: $1. Use --help for usage." ;;
esac

View File

@@ -0,0 +1,203 @@
#!/usr/bin/env python3
"""Wyoming Satellite wrapper — echo suppression, writer resilience, streaming timeout.
Monkey-patches WakeStreamingSatellite to fix four compounding bugs that cause
the satellite to freeze after the first voice command:
1. TTS Echo: Mic picks up speaker audio → false wake word trigger → Whisper
hallucinates on silence. Fix: mute mic→wake forwarding while speaker is active.
2. Server Writer Race: HA disconnects after first command, _writer becomes None.
If wake word fires before HA reconnects, _send_run_pipeline() silently drops
the event → satellite stuck in is_streaming=True forever.
Fix: check _writer before entering streaming mode; re-arm wake if no server.
3. No Streaming Timeout: Once stuck in streaming mode, there's no recovery.
Fix: auto-reset after 30s if no Transcript arrives.
4. Error events don't reset streaming state in upstream code.
Fix: reset is_streaming on Error events from server.
Usage: python3 satellite_wrapper.py <same args as wyoming_satellite>
"""
import asyncio
import logging
import time
from wyoming.audio import AudioChunk, AudioStart, AudioStop
from wyoming.error import Error
from wyoming.wake import Detection
from wyoming_satellite.satellite import WakeStreamingSatellite
_LOGGER = logging.getLogger()
# ─── Tuning constants ────────────────────────────────────────────────────────
# How long to keep wake muted after the last AudioStop from the server.
# Must be long enough for sox→aplay buffer to drain (~1-2s) plus audio decay.
_GRACE_SECONDS = 5.0
# Safety valve — unmute even if no AudioStop arrives (e.g. long TTS response).
_MAX_MUTE_SECONDS = 45.0
# Max time in streaming mode without receiving a Transcript or Error.
# Prevents permanent freeze if server never responds.
_STREAMING_TIMEOUT = 30.0
# ─── Save original methods ───────────────────────────────────────────────────
_orig_event_from_server = WakeStreamingSatellite.event_from_server
_orig_event_from_mic = WakeStreamingSatellite.event_from_mic
_orig_event_from_wake = WakeStreamingSatellite.event_from_wake
_orig_trigger_detection = WakeStreamingSatellite.trigger_detection
_orig_trigger_transcript = WakeStreamingSatellite.trigger_transcript
# ─── Patch A: Mute wake on awake.wav ─────────────────────────────────────────
async def _patched_trigger_detection(self, detection):
"""Mute wake word detection when awake.wav starts playing."""
self._speaker_mute_start = time.monotonic()
self._speaker_active = True
_LOGGER.debug("Speaker active (awake.wav) — wake detection muted")
await _orig_trigger_detection(self, detection)
# ─── Patch B: Mute wake on done.wav ──────────────────────────────────────────
async def _patched_trigger_transcript(self, transcript):
"""Keep muted through done.wav playback."""
self._speaker_active = True
_LOGGER.debug("Speaker active (done.wav) — wake detection muted")
await _orig_trigger_transcript(self, transcript)
# ─── Patch C: Echo tracking + error recovery ─────────────────────────────────
async def _patched_event_from_server(self, event):
"""Track TTS audio for echo suppression; reset streaming on errors."""
# Echo suppression: track when speaker is active
if AudioStart.is_type(event.type):
self._speaker_active = True
self._speaker_mute_start = time.monotonic()
_LOGGER.debug("Speaker active (TTS) — wake detection muted")
elif AudioStop.is_type(event.type):
self._speaker_unmute_at = time.monotonic() + _GRACE_SECONDS
_LOGGER.debug(
"TTS finished — will unmute wake in %.1fs", _GRACE_SECONDS
)
# Error recovery: reset streaming state if server reports an error
if Error.is_type(event.type) and self.is_streaming:
_LOGGER.warning("Error from server while streaming — resetting")
self.is_streaming = False
# Call original handler (plays done.wav, forwards TTS audio, etc.)
await _orig_event_from_server(self, event)
# After original handler: if Error arrived, re-arm wake detection
if Error.is_type(event.type) and not self.is_streaming:
await self.trigger_streaming_stop()
await self._send_wake_detect()
_LOGGER.info("Waiting for wake word (after error)")
# ─── Patch D: Echo suppression + streaming timeout ───────────────────────────
async def _patched_event_from_mic(self, event, audio_bytes=None):
"""Drop mic audio during speaker playback; timeout stuck streaming."""
# --- Streaming timeout ---
if self.is_streaming:
elapsed = time.monotonic() - getattr(self, "_streaming_start_time", 0)
if elapsed > _STREAMING_TIMEOUT:
_LOGGER.warning(
"Streaming timeout (%.0fs) — no Transcript received, resetting",
elapsed,
)
self.is_streaming = False
# Tell server we're done sending audio
await self.event_to_server(AudioStop().event())
await self.trigger_streaming_stop()
await self._send_wake_detect()
_LOGGER.info("Waiting for wake word (after timeout)")
return
# --- Echo suppression ---
if getattr(self, "_speaker_active", False) and not self.is_streaming:
now = time.monotonic()
# Check if grace period has elapsed after AudioStop
unmute_at = getattr(self, "_speaker_unmute_at", None)
if unmute_at and now >= unmute_at:
self._speaker_active = False
self._speaker_unmute_at = None
_LOGGER.debug("Wake detection unmuted (grace period elapsed)")
# Safety valve — don't stay muted forever
elif now - getattr(self, "_speaker_mute_start", now) > _MAX_MUTE_SECONDS:
self._speaker_active = False
self._speaker_unmute_at = None
_LOGGER.warning("Wake detection force-unmuted (max mute timeout)")
elif AudioChunk.is_type(event.type):
# Drop this mic chunk — don't feed speaker audio to wake word
return
await _orig_event_from_mic(self, event, audio_bytes)
# ─── Patch E: Writer check before streaming (THE CRITICAL FIX) ───────────────
async def _patched_event_from_wake(self, event):
"""Check server connection before entering streaming mode."""
if self.is_streaming:
return
if Detection.is_type(event.type):
# THE FIX: If no server connection, don't enter streaming mode.
# Without this, _send_run_pipeline() silently drops the RunPipeline
# event, and the satellite is stuck in is_streaming=True forever.
if self._writer is None:
_LOGGER.warning(
"Wake word detected but no server connection — re-arming"
)
await self._send_wake_detect()
return
self.is_streaming = True
self._streaming_start_time = time.monotonic()
_LOGGER.debug("Streaming audio")
await self._send_run_pipeline()
await self.forward_event(event)
await self.trigger_detection(Detection.from_event(event))
await self.trigger_streaming_start()
# ─── Apply patches ───────────────────────────────────────────────────────────
WakeStreamingSatellite.event_from_server = _patched_event_from_server
WakeStreamingSatellite.event_from_mic = _patched_event_from_mic
WakeStreamingSatellite.event_from_wake = _patched_event_from_wake
WakeStreamingSatellite.trigger_detection = _patched_trigger_detection
WakeStreamingSatellite.trigger_transcript = _patched_trigger_transcript
# Instance attributes (set as class defaults so they exist before __init__)
WakeStreamingSatellite._speaker_active = False
WakeStreamingSatellite._speaker_unmute_at = None
WakeStreamingSatellite._speaker_mute_start = 0.0
WakeStreamingSatellite._streaming_start_time = 0.0
# ─── Run the original main ───────────────────────────────────────────────────
if __name__ == "__main__":
from wyoming_satellite.__main__ import main
try:
asyncio.run(main())
except KeyboardInterrupt:
pass

527
homeai-rpi/setup.sh Executable file
View File

@@ -0,0 +1,527 @@
#!/usr/bin/env bash
# homeai-rpi/setup.sh — Bootstrap a Raspberry Pi as a Wyoming Satellite
#
# Run this ON the Pi (or push via deploy.sh from Mac Mini):
# curl -sL http://10.0.0.101:3000/aodhan/homeai/raw/branch/main/homeai-rpi/setup.sh | bash
# — or —
# ./setup.sh
#
# Prerequisites:
# - Raspberry Pi 5 with Raspberry Pi OS (Bookworm)
# - ReSpeaker 2-Mics pHAT installed and driver loaded (card shows in aplay -l)
# - Network connectivity to Mac Mini (10.0.0.101)
set -euo pipefail
# ─── Configuration ──────────────────────────────────────────────────────────
SATELLITE_NAME="homeai-kitchen"
SATELLITE_AREA="Kitchen"
MAC_MINI_IP="10.0.0.101"
# ReSpeaker 2-Mics pHAT — card 2 on Pi 5
# Using plughw for automatic format conversion (sample rate, channels)
MIC_DEVICE="plughw:2,0"
SPK_DEVICE="plughw:2,0"
# Wyoming satellite port (unique per satellite if running multiple)
SATELLITE_PORT="10700"
# Directories
INSTALL_DIR="${HOME}/homeai-satellite"
VENV_DIR="${INSTALL_DIR}/venv"
SOUNDS_DIR="${INSTALL_DIR}/sounds"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*"; exit 1; }
log_step() { echo -e "${CYAN}[STEP]${NC} $*"; }
# ─── Preflight checks ──────────────────────────────────────────────────────
log_step "Preflight checks..."
# Check we're on a Pi
if ! grep -qi "raspberry\|bcm" /proc/cpuinfo 2>/dev/null; then
log_warn "This doesn't look like a Raspberry Pi — proceeding anyway"
fi
# Check ReSpeaker is available
if ! aplay -l 2>/dev/null | grep -q "seeed-2mic-voicecard"; then
log_error "ReSpeaker 2-Mics pHAT not found in aplay -l. Is the driver loaded?"
fi
log_ok "ReSpeaker 2-Mics pHAT detected"
# Check Python 3
if ! command -v python3 &>/dev/null; then
log_error "python3 not found. Install with: sudo apt install python3 python3-venv python3-pip"
fi
log_ok "Python $(python3 --version | cut -d' ' -f2)"
# ─── Install system dependencies ───────────────────────────────────────────
log_step "Installing system dependencies..."
sudo apt-get update -qq
# Allow non-zero exit — pre-existing DKMS/kernel issues (e.g. seeed-voicecard
# failing to build against a pending kernel update) can cause apt to return
# errors even though our packages installed successfully.
sudo apt-get install -y -qq \
python3-venv \
python3-pip \
alsa-utils \
sox \
libsox-fmt-all \
libopenblas0 \
2>/dev/null || log_warn "apt-get returned errors (likely pre-existing kernel/DKMS issue — continuing)"
# Verify the packages we actually need are present
for cmd in sox arecord aplay; do
command -v "$cmd" &>/dev/null || log_error "${cmd} not found after install"
done
log_ok "System dependencies installed"
# ─── Create install directory ───────────────────────────────────────────────
log_step "Setting up ${INSTALL_DIR}..."
mkdir -p "${INSTALL_DIR}" "${SOUNDS_DIR}"
# ─── Create Python venv ────────────────────────────────────────────────────
if [[ ! -d "${VENV_DIR}" ]]; then
log_step "Creating Python virtual environment..."
python3 -m venv "${VENV_DIR}"
fi
source "${VENV_DIR}/bin/activate"
pip install --upgrade pip setuptools wheel -q
# ─── Install Wyoming Satellite + openWakeWord ──────────────────────────────
log_step "Installing Wyoming Satellite..."
pip install wyoming-satellite -q
log_step "Installing openWakeWord..."
pip install openwakeword -q
log_step "Installing numpy..."
pip install numpy -q
log_ok "All Python packages installed"
# ─── Copy wakeword command script ──────────────────────────────────────────
log_step "Installing wake word detection script..."
cat > "${INSTALL_DIR}/wakeword_command.py" << 'PYEOF'
#!/usr/bin/env python3
"""Wake word detection command for Wyoming Satellite.
The satellite feeds raw 16kHz 16-bit mono audio via stdin.
This script reads that audio, runs openWakeWord, and prints
the wake word name to stdout when detected.
Usage (called by wyoming-satellite --wake-command):
python wakeword_command.py [--wake-word hey_jarvis] [--threshold 0.3]
"""
import argparse
import sys
import numpy as np
import logging
_LOGGER = logging.getLogger(__name__)
SAMPLE_RATE = 16000
CHUNK_SIZE = 1280 # ~80ms at 16kHz — recommended by openWakeWord
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--wake-word", default="hey_jarvis")
parser.add_argument("--threshold", type=float, default=0.5)
parser.add_argument("--cooldown", type=float, default=3.0)
parser.add_argument("--debug", action="store_true")
args = parser.parse_args()
logging.basicConfig(
level=logging.DEBUG if args.debug else logging.WARNING,
format="%(asctime)s %(levelname)s %(message)s",
stream=sys.stderr,
)
import openwakeword
from openwakeword.model import Model
oww = Model(
wakeword_models=[args.wake_word],
inference_framework="onnx",
)
import time
last_trigger = 0.0
bytes_per_chunk = CHUNK_SIZE * 2 # 16-bit = 2 bytes per sample
_LOGGER.debug("Wake word command ready, reading audio from stdin")
try:
while True:
raw = sys.stdin.buffer.read(bytes_per_chunk)
if not raw:
break
if len(raw) < bytes_per_chunk:
raw = raw + b'\x00' * (bytes_per_chunk - len(raw))
chunk = np.frombuffer(raw, dtype=np.int16)
oww.predict(chunk)
for ww, scores in oww.prediction_buffer.items():
score = scores[-1] if scores else 0.0
if score >= args.threshold:
now = time.time()
if now - last_trigger >= args.cooldown:
last_trigger = now
print(ww, flush=True)
_LOGGER.debug("Wake word detected: %s (score=%.3f)", ww, score)
except (KeyboardInterrupt, BrokenPipeError):
pass
if __name__ == "__main__":
main()
PYEOF
chmod +x "${INSTALL_DIR}/wakeword_command.py"
log_ok "Wake word script installed"
# ─── Copy satellite wrapper ──────────────────────────────────────────────
log_step "Installing satellite wrapper (echo suppression + writer resilience)..."
cat > "${INSTALL_DIR}/satellite_wrapper.py" << 'WRAPEOF'
#!/usr/bin/env python3
"""Wyoming Satellite wrapper — echo suppression, writer resilience, streaming timeout.
Monkey-patches WakeStreamingSatellite to fix four compounding bugs that cause
the satellite to freeze after the first voice command:
1. TTS Echo: Mic picks up speaker audio → false wake word trigger
2. Server Writer Race: _writer is None when wake word fires → silent drop
3. No Streaming Timeout: stuck in is_streaming=True forever
4. Error events don't reset streaming state in upstream code
"""
import asyncio
import logging
import time
from wyoming.audio import AudioChunk, AudioStart, AudioStop
from wyoming.error import Error
from wyoming.wake import Detection
from wyoming_satellite.satellite import WakeStreamingSatellite
_LOGGER = logging.getLogger()
_GRACE_SECONDS = 5.0
_MAX_MUTE_SECONDS = 45.0
_STREAMING_TIMEOUT = 30.0
_orig_event_from_server = WakeStreamingSatellite.event_from_server
_orig_event_from_mic = WakeStreamingSatellite.event_from_mic
_orig_event_from_wake = WakeStreamingSatellite.event_from_wake
_orig_trigger_detection = WakeStreamingSatellite.trigger_detection
_orig_trigger_transcript = WakeStreamingSatellite.trigger_transcript
async def _patched_trigger_detection(self, detection):
self._speaker_mute_start = time.monotonic()
self._speaker_active = True
_LOGGER.debug("Speaker active (awake.wav) — wake detection muted")
await _orig_trigger_detection(self, detection)
async def _patched_trigger_transcript(self, transcript):
self._speaker_active = True
_LOGGER.debug("Speaker active (done.wav) — wake detection muted")
await _orig_trigger_transcript(self, transcript)
async def _patched_event_from_server(self, event):
if AudioStart.is_type(event.type):
self._speaker_active = True
self._speaker_mute_start = time.monotonic()
_LOGGER.debug("Speaker active (TTS) — wake detection muted")
elif AudioStop.is_type(event.type):
self._speaker_unmute_at = time.monotonic() + _GRACE_SECONDS
_LOGGER.debug("TTS finished — will unmute wake in %.1fs", _GRACE_SECONDS)
if Error.is_type(event.type) and self.is_streaming:
_LOGGER.warning("Error from server while streaming — resetting")
self.is_streaming = False
await _orig_event_from_server(self, event)
if Error.is_type(event.type) and not self.is_streaming:
await self.trigger_streaming_stop()
await self._send_wake_detect()
_LOGGER.info("Waiting for wake word (after error)")
async def _patched_event_from_mic(self, event, audio_bytes=None):
if self.is_streaming:
elapsed = time.monotonic() - getattr(self, "_streaming_start_time", 0)
if elapsed > _STREAMING_TIMEOUT:
_LOGGER.warning(
"Streaming timeout (%.0fs) — no Transcript received, resetting",
elapsed,
)
self.is_streaming = False
await self.event_to_server(AudioStop().event())
await self.trigger_streaming_stop()
await self._send_wake_detect()
_LOGGER.info("Waiting for wake word (after timeout)")
return
if getattr(self, "_speaker_active", False) and not self.is_streaming:
now = time.monotonic()
unmute_at = getattr(self, "_speaker_unmute_at", None)
if unmute_at and now >= unmute_at:
self._speaker_active = False
self._speaker_unmute_at = None
_LOGGER.debug("Wake detection unmuted (grace period elapsed)")
elif now - getattr(self, "_speaker_mute_start", now) > _MAX_MUTE_SECONDS:
self._speaker_active = False
self._speaker_unmute_at = None
_LOGGER.warning("Wake detection force-unmuted (max mute timeout)")
elif AudioChunk.is_type(event.type):
return
await _orig_event_from_mic(self, event, audio_bytes)
async def _patched_event_from_wake(self, event):
if self.is_streaming:
return
if Detection.is_type(event.type):
if self._writer is None:
_LOGGER.warning(
"Wake word detected but no server connection — re-arming"
)
await self._send_wake_detect()
return
self.is_streaming = True
self._streaming_start_time = time.monotonic()
_LOGGER.debug("Streaming audio")
await self._send_run_pipeline()
await self.forward_event(event)
await self.trigger_detection(Detection.from_event(event))
await self.trigger_streaming_start()
WakeStreamingSatellite.event_from_server = _patched_event_from_server
WakeStreamingSatellite.event_from_mic = _patched_event_from_mic
WakeStreamingSatellite.event_from_wake = _patched_event_from_wake
WakeStreamingSatellite.trigger_detection = _patched_trigger_detection
WakeStreamingSatellite.trigger_transcript = _patched_trigger_transcript
WakeStreamingSatellite._speaker_active = False
WakeStreamingSatellite._speaker_unmute_at = None
WakeStreamingSatellite._speaker_mute_start = 0.0
WakeStreamingSatellite._streaming_start_time = 0.0
if __name__ == "__main__":
from wyoming_satellite.__main__ import main
try:
asyncio.run(main())
except KeyboardInterrupt:
pass
WRAPEOF
chmod +x "${INSTALL_DIR}/satellite_wrapper.py"
log_ok "Satellite wrapper installed"
# ─── Download wake word model ──────────────────────────────────────────────
log_step "Downloading hey_jarvis wake word model..."
"${VENV_DIR}/bin/python3" -c "
import openwakeword
openwakeword.utils.download_models(model_names=['hey_jarvis'])
print('Model downloaded')
" 2>&1 | grep -v "device_discovery"
log_ok "Wake word model ready"
# ─── Create mic capture wrapper ────────────────────────────────────────────
log_step "Creating mic capture wrapper (stereo → mono conversion)..."
cat > "${INSTALL_DIR}/mic-capture.sh" << 'MICEOF'
#!/bin/bash
# Record stereo from ReSpeaker WM8960, convert to mono 16kHz 16-bit for Wyoming
arecord -D plughw:2,0 -r 16000 -c 2 -f S16_LE -t raw -q - | sox -t raw -r 16000 -c 2 -b 16 -e signed-integer - -t raw -r 16000 -c 1 -b 16 -e signed-integer -
MICEOF
chmod +x "${INSTALL_DIR}/mic-capture.sh"
log_ok "Mic capture wrapper installed"
# ─── Create speaker playback wrapper ──────────────────────────────────────
log_step "Creating speaker playback wrapper (mono → stereo conversion)..."
cat > "${INSTALL_DIR}/speaker-playback.sh" << 'SPKEOF'
#!/bin/bash
# Convert mono 24kHz 16-bit input to stereo for WM8960 playback
sox -t raw -r 24000 -c 1 -b 16 -e signed-integer - -t raw -r 24000 -c 2 -b 16 -e signed-integer - | aplay -D plughw:2,0 -r 24000 -c 2 -f S16_LE -t raw -q -
SPKEOF
chmod +x "${INSTALL_DIR}/speaker-playback.sh"
log_ok "Speaker playback wrapper installed"
# ─── Fix ReSpeaker overlay for Pi 5 ────────────────────────────────────────
log_step "Configuring wm8960-soundcard overlay (Pi 5 compatible)..."
# Disable the seeed-voicecard service (loads wrong overlay for Pi 5)
if systemctl is-enabled seeed-voicecard.service &>/dev/null; then
sudo systemctl disable seeed-voicecard.service 2>/dev/null || true
log_info "Disabled seeed-voicecard service"
fi
# Add upstream wm8960-soundcard overlay to config.txt if not present
if ! grep -q "dtoverlay=wm8960-soundcard" /boot/firmware/config.txt 2>/dev/null; then
sudo bash -c 'echo "dtoverlay=wm8960-soundcard" >> /boot/firmware/config.txt'
log_info "Added wm8960-soundcard overlay to /boot/firmware/config.txt"
fi
# Load overlay now if not already active
if ! dtoverlay -l 2>/dev/null | grep -q wm8960-soundcard; then
sudo dtoverlay -r seeed-2mic-voicecard 2>/dev/null || true
sudo dtoverlay wm8960-soundcard 2>/dev/null || true
fi
log_ok "Audio overlay configured"
# ─── Generate feedback sounds ──────────────────────────────────────────────
log_step "Generating feedback sounds..."
# Must be plain 16-bit PCM WAV — Python wave module can't read WAVE_FORMAT_EXTENSIBLE
# Awake chime — short rising tone
sox -n -r 16000 -b 16 -c 1 -e signed-integer "${SOUNDS_DIR}/awake.wav" \
synth 0.15 sin 800 fade t 0.01 0.15 0.05 \
vol 0.5 \
2>/dev/null || log_warn "Could not generate awake.wav (sox issue)"
# Done chime — short falling tone
sox -n -r 16000 -b 16 -c 1 -e signed-integer "${SOUNDS_DIR}/done.wav" \
synth 0.15 sin 600 fade t 0.01 0.15 0.05 \
vol 0.5 \
2>/dev/null || log_warn "Could not generate done.wav (sox issue)"
log_ok "Feedback sounds ready"
# ─── Set ALSA mixer defaults ───────────────────────────────────────────────
log_step "Configuring ALSA mixer for ReSpeaker..."
# Playback — 80% volume, unmute
amixer -c 2 sset 'Playback' 80% unmute 2>/dev/null || true
amixer -c 2 sset 'Speaker' 80% unmute 2>/dev/null || true
# Capture — max out capture volume
amixer -c 2 sset 'Capture' 100% cap 2>/dev/null || true
# Enable mic input boost (critical — without this, signal is near-silent)
amixer -c 2 cset name='Left Input Mixer Boost Switch' on 2>/dev/null || true
amixer -c 2 cset name='Right Input Mixer Boost Switch' on 2>/dev/null || true
# Mic preamp boost to +13dB (level 1 on a 0–3 scale — higher levels cause clipping)
amixer -c 2 cset name='Left Input Boost Mixer LINPUT1 Volume' 1 2>/dev/null || true
amixer -c 2 cset name='Right Input Boost Mixer RINPUT1 Volume' 1 2>/dev/null || true
# ADC capture volume — moderate to avoid clipping (max=255)
amixer -c 2 cset name='ADC PCM Capture Volume' 180,180 2>/dev/null || true
log_ok "ALSA mixer configured"
# ─── Install systemd service ───────────────────────────────────────────────
log_step "Installing systemd service..."
sudo tee /etc/systemd/system/homeai-satellite.service > /dev/null << SVCEOF
[Unit]
Description=HomeAI Wyoming Satellite (${SATELLITE_AREA})
After=network-online.target sound.target
Wants=network-online.target
[Service]
Type=simple
User=${USER}
WorkingDirectory=${INSTALL_DIR}
ExecStart=${VENV_DIR}/bin/python3 ${INSTALL_DIR}/satellite_wrapper.py \\
--uri tcp://0.0.0.0:${SATELLITE_PORT} \\
--name "${SATELLITE_NAME}" \\
--area "${SATELLITE_AREA}" \\
--mic-command ${INSTALL_DIR}/mic-capture.sh \\
--snd-command ${INSTALL_DIR}/speaker-playback.sh \\
--mic-command-rate 16000 \\
--mic-command-width 2 \\
--mic-command-channels 1 \\
--snd-command-rate 24000 \\
--snd-command-width 2 \\
--snd-command-channels 1 \\
--wake-command "${VENV_DIR}/bin/python3 ${INSTALL_DIR}/wakeword_command.py --wake-word hey_jarvis --threshold 0.5" \\
--wake-command-rate 16000 \\
--wake-command-width 2 \\
--wake-command-channels 1 \\
--awake-wav ${SOUNDS_DIR}/awake.wav \\
--done-wav ${SOUNDS_DIR}/done.wav
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
SVCEOF
sudo systemctl daemon-reload
sudo systemctl enable homeai-satellite.service
sudo systemctl restart homeai-satellite.service
log_ok "systemd service installed and started"
# ─── Verify ────────────────────────────────────────────────────────────────
log_step "Verifying satellite..."
sleep 2
if systemctl is-active --quiet homeai-satellite.service; then
log_ok "Satellite is running!"
else
log_warn "Satellite may not have started cleanly. Check logs:"
echo " journalctl -u homeai-satellite.service -f"
fi
echo ""
echo -e "${GREEN}═══════════════════════════════════════════════════════════════${NC}"
echo -e "${GREEN} HomeAI Kitchen Satellite — Setup Complete${NC}"
echo -e "${GREEN}═══════════════════════════════════════════════════════════════${NC}"
echo ""
echo " Satellite: ${SATELLITE_NAME} (${SATELLITE_AREA})"
echo " Port: ${SATELLITE_PORT}"
echo " Mic: ${MIC_DEVICE} (ReSpeaker 2-Mics)"
echo " Speaker: ${SPK_DEVICE} (ReSpeaker 3.5mm)"
echo " Wake word: hey_jarvis"
echo ""
echo " Next steps:"
echo " 1. In Home Assistant, go to Settings → Devices & Services → Add Integration"
echo " 2. Search for 'Wyoming Protocol'"
echo " 3. Enter host: $(hostname -I | awk '{print $1}') port: ${SATELLITE_PORT}"
echo " 4. Assign the HomeAI voice pipeline to this satellite"
echo ""
echo " Useful commands:"
echo " journalctl -u homeai-satellite.service -f # live logs"
echo " sudo systemctl restart homeai-satellite # restart"
echo " sudo systemctl status homeai-satellite # status"
echo " arecord -D ${MIC_DEVICE} -d 3 -f S16_LE -r 16000 /tmp/test.wav # test mic"
echo " aplay -D ${SPK_DEVICE} /tmp/test.wav # test speaker"
echo ""

3
homeai-rpi/speaker-playback.sh Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/bash
# Convert mono 24kHz 16-bit input to stereo for WM8960 playback
sox -t raw -r 24000 -c 1 -b 16 -e signed-integer - -t raw -r 24000 -c 2 -b 16 -e signed-integer - | aplay -D plughw:2,0 -r 24000 -c 2 -f S16_LE -t raw -q -
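
A quick standalone check of the playback path (assuming the wrapper is installed at ~/homeai-satellite/speaker-playback.sh, as setup.sh does): pipe a raw mono 24kHz tone through it and listen for it on the ReSpeaker output.

```bash
# 1s 440Hz test tone, raw mono 24kHz 16-bit, through the stereo-conversion wrapper
sox -n -t raw -r 24000 -c 1 -b 16 -e signed-integer - synth 1 sin 440 vol 0.5 \
  | ~/homeai-satellite/speaker-playback.sh
```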

homeai-visual/launchd/com.homeai.vtube-bridge.plist
View File

@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.vtube-bridge</string>
<key>ProgramArguments</key>
<array>
<string>/Users/aodhan/homeai-visual-env/bin/python3</string>
<string>/Users/aodhan/gitea/homeai/homeai-visual/vtube-bridge.py</string>
<string>--port</string>
<string>8002</string>
<string>--character</string>
<string>/Users/aodhan/gitea/homeai/homeai-dashboard/characters/aria.json</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-vtube-bridge.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-vtube-bridge-error.log</string>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>
</dict>
</plist>

homeai-visual/scripts/test-expressions.py
View File

@@ -0,0 +1,170 @@
#!/usr/bin/env python3
"""
Test script for VTube Studio Expression Bridge.
Usage:
python3 test-expressions.py # test all expressions
python3 test-expressions.py --auth # run auth flow first
python3 test-expressions.py --lipsync # test lip sync parameter
python3 test-expressions.py --latency # measure round-trip latency
Requires the vtube-bridge to be running on port 8002.
"""
import argparse
import json
import sys
import time
import urllib.request
BRIDGE_URL = "http://localhost:8002"
EXPRESSIONS = ["idle", "listening", "thinking", "speaking", "happy", "sad", "surprised", "error"]
def _post(path: str, data: dict | None = None) -> dict:
body = json.dumps(data or {}).encode()
req = urllib.request.Request(
f"{BRIDGE_URL}{path}",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def _get(path: str) -> dict:
req = urllib.request.Request(f"{BRIDGE_URL}{path}")
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def check_bridge():
"""Verify bridge is running and connected."""
try:
status = _get("/status")
print(f"Bridge status: connected={status['connected']}, authenticated={status['authenticated']}")
print(f" Expressions: {', '.join(status.get('expressions', []))}")
if not status["connected"]:
print("\n WARNING: Not connected to VTube Studio. Is it running?")
if not status["authenticated"]:
print(" WARNING: Not authenticated. Run with --auth to initiate auth flow.")
return status
except Exception as e:
print(f"ERROR: Cannot reach bridge at {BRIDGE_URL}: {e}")
print(" Is vtube-bridge.py running?")
sys.exit(1)
def run_auth():
"""Initiate auth flow — user must click Allow in VTube Studio."""
print("Requesting authentication token...")
print(" >>> Click 'Allow' in VTube Studio when prompted <<<")
result = _post("/auth")
print(f" Result: {json.dumps(result, indent=2)}")
return result
def test_expressions(delay: float = 2.0):
"""Cycle through all expressions with a pause between each."""
print(f"\nCycling through {len(EXPRESSIONS)} expressions ({delay}s each):\n")
for expr in EXPRESSIONS:
print(f"{expr}...", end=" ", flush=True)
t0 = time.monotonic()
result = _post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
if result.get("ok"):
print(f"OK ({dt:.0f}ms)")
else:
print(f"FAILED: {result.get('error', 'unknown')}")
time.sleep(delay)
# Return to idle
_post("/expression", {"event": "idle"})
print("\n Returned to idle.")
def test_lipsync(duration: float = 3.0):
"""Simulate lip sync by sweeping MouthOpen 0→1→0."""
import math
print(f"\nTesting lip sync (MouthOpen sweep, {duration}s)...\n")
fps = 20
frames = int(duration * fps)
for i in range(frames):
t = i / frames
# Sine wave for smooth open/close
value = abs(math.sin(t * math.pi * 4))
value = round(value, 3)
_post("/parameter", {"name": "MouthOpen", "value": value})
print(f"\r MouthOpen = {value:.3f}", end="", flush=True)
time.sleep(1.0 / fps)
_post("/parameter", {"name": "MouthOpen", "value": 0.0})
print("\r MouthOpen = 0.000 (done) ")
def test_latency(iterations: int = 20):
"""Measure expression trigger round-trip latency."""
print(f"\nMeasuring latency ({iterations} iterations)...\n")
times = []
for i in range(iterations):
expr = "thinking" if i % 2 == 0 else "idle"
t0 = time.monotonic()
_post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
times.append(dt)
print(f" {i+1:2d}. {expr:10s}{dt:.1f}ms")
avg = sum(times) / len(times)
mn = min(times)
mx = max(times)
print(f"\n Avg: {avg:.1f}ms Min: {mn:.1f}ms Max: {mx:.1f}ms")
if avg < 100:
print(" PASS: Average latency under 100ms target")
else:
print(" WARNING: Average latency exceeds 100ms target")
# Return to idle
_post("/expression", {"event": "idle"})
def main():
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge Tester")
parser.add_argument("--auth", action="store_true", help="Run auth flow")
parser.add_argument("--lipsync", action="store_true", help="Test lip sync parameter sweep")
parser.add_argument("--latency", action="store_true", help="Measure round-trip latency")
parser.add_argument("--delay", type=float, default=2.0, help="Delay between expressions (default: 2s)")
parser.add_argument("--all", action="store_true", help="Run all tests")
args = parser.parse_args()
print("VTube Studio Expression Bridge Tester")
print("=" * 42)
status = check_bridge()
if args.auth:
run_auth()
print()
status = check_bridge()
if not status.get("authenticated") and not args.auth:
print("\nNot authenticated — skipping expression tests.")
print("Run with --auth to authenticate, or start VTube Studio first.")
return
if args.all:
test_expressions(args.delay)
test_lipsync()
test_latency()
elif args.lipsync:
test_lipsync()
elif args.latency:
test_latency()
else:
test_expressions(args.delay)
if __name__ == "__main__":
main()

homeai-visual/setup.sh
View File

@@ -1,17 +1,16 @@
 #!/usr/bin/env bash
-# homeai-visual/setup.sh — P7: VTube Studio bridge + Live2D expressions
+# homeai-visual/setup.sh — P7: VTube Studio Expression Bridge
 #
-# Components:
-#   - vtube_studio.py — WebSocket client skill for OpenClaw
-#   - lipsync.py — amplitude-based lip sync
-#   - auth.py — VTube Studio token management
+# Sets up:
+#   - Python venv with websockets
+#   - vtube-bridge daemon (HTTP ↔ WebSocket bridge)
+#   - vtube-ctl CLI (symlinked to PATH)
+#   - launchd service
 #
 # Prerequisites:
 #   - P4 (homeai-agent) — OpenClaw running
 #   - P5 (homeai-character) — aria.json with live2d_expressions set
-#   - macOS: VTube Studio installed (Mac App Store)
-#   - Linux: N/A — VTube Studio is macOS/Windows/iOS only
-#     Linux dev can test the skill code but not the VTube Studio side
+#   - VTube Studio installed (Mac App Store) with WebSocket API enabled
 
 set -euo pipefail
 
@@ -19,42 +18,61 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
 source "${REPO_DIR}/scripts/common.sh"
 
-log_section "P7: VTube Studio Bridge"
-detect_platform
+VENV_DIR="$HOME/homeai-visual-env"
+PLIST_SRC="${SCRIPT_DIR}/launchd/com.homeai.vtube-bridge.plist"
+PLIST_DST="$HOME/Library/LaunchAgents/com.homeai.vtube-bridge.plist"
+VTUBE_CTL_SRC="$HOME/.openclaw/skills/vtube-studio/scripts/vtube-ctl"
 
-if [[ "$OS_TYPE" == "linux" ]]; then
-    log_warn "VTube Studio is not available on Linux."
-    log_warn "This sub-project requires macOS (Mac Mini)."
-fi
+log_section "P7: VTube Studio Expression Bridge"
 
-# ─── TODO: Implementation ──────────────────────────────────────────────────────
-cat <<'EOF'
-
-  ┌─────────────────────────────────────────────────────────────────┐
-  │  P7: homeai-visual — NOT YET IMPLEMENTED                        │
-  │                                                                 │
-  │  macOS only (VTube Studio is macOS/iOS/Windows)                 │
-  │                                                                 │
-  │  Implementation steps:                                          │
-  │   1. Install VTube Studio from Mac App Store                    │
-  │   2. Enable WebSocket API in VTube Studio (Settings → port 8001)│
-  │   3. Source/purchase Live2D model                               │
-  │   4. Create expression hotkeys for 8 states                     │
-  │   5. Implement skills/vtube_studio.py (WebSocket client)        │
-  │   6. Implement skills/lipsync.py (amplitude → MouthOpen param)  │
-  │   7. Implement skills/auth.py (token request + persistence)     │
-  │   8. Register vtube_studio skill with OpenClaw                  │
-  │   9. Update aria.json live2d_expressions with hotkey IDs        │
-  │  10. Test all 8 expression states                               │
-  │                                                                 │
-  │  On Linux: implement Python skills, test WebSocket protocol     │
-  │  with a mock server before connecting to real VTube Studio.     │
-  │                                                                 │
-  │  Interface contracts:                                           │
-  │    VTUBE_WS_URL=ws://localhost:8001                             │
-  └─────────────────────────────────────────────────────────────────┘
-EOF
-
-log_info "P7 is not yet implemented. See homeai-visual/PLAN.md for details."
-exit 0
+# ─── Python venv ──────────────────────────────────────────────────────────────
+if [[ ! -d "$VENV_DIR" ]]; then
+    log_info "Creating Python venv at $VENV_DIR..."
+    python3 -m venv "$VENV_DIR"
+fi
+log_info "Installing dependencies..."
+"$VENV_DIR/bin/pip" install --upgrade pip -q
+"$VENV_DIR/bin/pip" install websockets -q
+log_ok "Python venv ready ($(${VENV_DIR}/bin/python3 --version))"
+
+# ─── vtube-ctl symlink ───────────────────────────────────────────────────────
+if [[ -f "$VTUBE_CTL_SRC" ]]; then
+    chmod +x "$VTUBE_CTL_SRC"
+    ln -sf "$VTUBE_CTL_SRC" /opt/homebrew/bin/vtube-ctl
+    log_ok "vtube-ctl symlinked to /opt/homebrew/bin/vtube-ctl"
+else
+    log_warn "vtube-ctl not found at $VTUBE_CTL_SRC — skipping symlink"
+fi
+
+# ─── launchd service ─────────────────────────────────────────────────────────
+if [[ -f "$PLIST_SRC" ]]; then
+    # Unload if already loaded
+    launchctl bootout "gui/$(id -u)/com.homeai.vtube-bridge" 2>/dev/null || true
+    cp "$PLIST_SRC" "$PLIST_DST"
+    launchctl bootstrap "gui/$(id -u)" "$PLIST_DST"
+    log_ok "launchd service loaded: com.homeai.vtube-bridge"
+else
+    log_warn "Plist not found at $PLIST_SRC — skipping launchd setup"
+fi
+
+# ─── Status ──────────────────────────────────────────────────────────────────
+echo ""
+log_info "VTube Bridge setup complete."
+log_info ""
+log_info "Next steps:"
+log_info "  1. Install VTube Studio from Mac App Store"
+log_info "  2. Enable WebSocket API: Settings > WebSocket API > port 8001"
+log_info "  3. Load a Live2D model"
+log_info "  4. Create expression hotkeys (idle, listening, thinking, speaking, happy, sad, surprised, error)"
+log_info "  5. Run: vtube-ctl auth   (click Allow in VTube Studio)"
+log_info "  6. Run: python3 ${SCRIPT_DIR}/scripts/test-expressions.py --all"
+log_info "  7. Update aria.json with real hotkey UUIDs"
+log_info ""
+log_info "Logs: /tmp/homeai-vtube-bridge.log"
+log_info "Bridge: http://localhost:8002/status"
homeai-visual/vtube-bridge.py
View File

@@ -0,0 +1,454 @@
#!/usr/bin/env python3
"""
VTube Studio Expression Bridge — persistent WebSocket ↔ HTTP bridge.
Maintains a long-lived WebSocket connection to VTube Studio and exposes
a simple HTTP API so other HomeAI components can trigger expressions and
inject parameters (lip sync) without managing their own WS connections.
HTTP API (port 8002):
POST /expression {"event": "thinking"} → trigger hotkey
POST /parameter {"name": "MouthOpen", "value": 0.5} → inject param
POST /parameters [{"name": "MouthOpen", "value": 0.5}, ...]
POST /auth {} → request new token
GET /status → connection info
GET /expressions → list available expressions
Requires: pip install websockets
"""
import argparse
import asyncio
import json
import logging
import signal
import sys
import time
from http import HTTPStatus
from pathlib import Path
try:
import websockets
from websockets.exceptions import ConnectionClosed
except ImportError:
print("ERROR: 'websockets' package required. Install with: pip install websockets", file=sys.stderr)
sys.exit(1)
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
DEFAULT_VTUBE_WS_URL = "ws://localhost:8001"
DEFAULT_HTTP_PORT = 8002
TOKEN_PATH = Path.home() / ".openclaw" / "vtube_token.json"
DEFAULT_CHARACTER_PATH = (
Path.home() / "gitea" / "homeai" / "homeai-dashboard" / "characters" / "aria.json"
)
logger = logging.getLogger("vtube-bridge")
# ---------------------------------------------------------------------------
# VTube Studio WebSocket Client
# ---------------------------------------------------------------------------
class VTubeClient:
"""Persistent async WebSocket client for VTube Studio API."""
def __init__(self, ws_url: str, character_path: Path):
self.ws_url = ws_url
self.character_path = character_path
self._ws = None
self._token: str | None = None
self._authenticated = False
self._current_expression: str | None = None
self._connected = False
self._request_id = 0
self._lock = asyncio.Lock()
self._load_token()
self._load_character()
# ── Character config ──────────────────────────────────────────────
def _load_character(self):
"""Load expression mappings from character JSON."""
self.expression_map: dict[str, str] = {}
self.ws_triggers: dict = {}
try:
if self.character_path.exists():
cfg = json.loads(self.character_path.read_text())
self.expression_map = cfg.get("live2d_expressions", {})
self.ws_triggers = cfg.get("vtube_ws_triggers", {})
logger.info("Loaded %d expressions from %s", len(self.expression_map), self.character_path.name)
else:
logger.warning("Character file not found: %s", self.character_path)
except Exception as e:
logger.error("Failed to load character config: %s", e)
def reload_character(self):
"""Hot-reload character config without restarting."""
self._load_character()
return {"expressions": self.expression_map, "triggers": self.ws_triggers}
# ── Token persistence ─────────────────────────────────────────────
def _load_token(self):
try:
if TOKEN_PATH.exists():
data = json.loads(TOKEN_PATH.read_text())
self._token = data.get("token")
logger.info("Loaded auth token from %s", TOKEN_PATH)
except Exception as e:
logger.warning("Could not load token: %s", e)
def _save_token(self, token: str):
TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
TOKEN_PATH.write_text(json.dumps({"token": token}, indent=2))
self._token = token
logger.info("Saved auth token to %s", TOKEN_PATH)
# ── WebSocket comms ───────────────────────────────────────────────
def _next_id(self) -> str:
self._request_id += 1
return f"homeai-{self._request_id}"
async def _send(self, message_type: str, data: dict | None = None) -> dict:
"""Send a VTube Studio API message and return the response."""
payload = {
"apiName": "VTubeStudioPublicAPI",
"apiVersion": "1.0",
"requestID": self._next_id(),
"messageType": message_type,
"data": data or {},
}
await self._ws.send(json.dumps(payload))
resp = json.loads(await asyncio.wait_for(self._ws.recv(), timeout=10))
return resp
# ── Connection lifecycle ──────────────────────────────────────────
async def connect(self):
"""Connect and authenticate to VTube Studio."""
try:
self._ws = await websockets.connect(self.ws_url, ping_interval=20, ping_timeout=10)
self._connected = True
logger.info("Connected to VTube Studio at %s", self.ws_url)
if self._token:
await self._authenticate()
else:
logger.warning("No auth token — call POST /auth to initiate authentication")
except Exception as e:
self._connected = False
self._authenticated = False
logger.error("Connection failed: %s", e)
raise
async def _authenticate(self):
"""Authenticate with an existing token."""
resp = await self._send("AuthenticationRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"authenticationToken": self._token,
})
self._authenticated = resp.get("data", {}).get("authenticated", False)
if self._authenticated:
logger.info("Authenticated successfully")
else:
logger.warning("Token rejected — request a new one via POST /auth")
self._authenticated = False
async def request_new_token(self) -> dict:
"""Request a new auth token. User must click Allow in VTube Studio."""
if not self._connected:
return {"error": "Not connected to VTube Studio"}
resp = await self._send("AuthenticationTokenRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"pluginIcon": None,
})
token = resp.get("data", {}).get("authenticationToken")
if token:
self._save_token(token)
await self._authenticate()
return {"authenticated": self._authenticated, "token_saved": True}
return {"error": "No token received", "response": resp}
async def disconnect(self):
if self._ws:
await self._ws.close()
self._connected = False
self._authenticated = False
async def ensure_connected(self):
"""Reconnect if the connection dropped."""
if not self._connected or self._ws is None or self._ws.closed:
logger.info("Reconnecting...")
await self.connect()
# ── Expression & parameter API ────────────────────────────────────
async def trigger_expression(self, event: str) -> dict:
"""Trigger a named expression from the character config."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
hotkey_id = self.expression_map.get(event)
if not hotkey_id:
return {"error": f"Unknown expression: {event}", "available": list(self.expression_map.keys())}
resp = await self._send("HotkeyTriggerRequest", {"hotkeyID": hotkey_id})
self._current_expression = event
return {"ok": True, "expression": event, "hotkey_id": hotkey_id}
async def set_parameter(self, name: str, value: float, weight: float = 1.0) -> dict:
"""Inject a single VTube Studio parameter value."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": [{"id": name, "value": value, "weight": weight}],
})
return {"ok": True, "name": name, "value": value}
async def set_parameters(self, params: list[dict]) -> dict:
"""Inject multiple VTube Studio parameters at once."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
param_values = [
{"id": p["name"], "value": p["value"], "weight": p.get("weight", 1.0)}
for p in params
]
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": param_values,
})
return {"ok": True, "count": len(param_values)}
async def list_hotkeys(self) -> dict:
"""List all hotkeys available in the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("HotkeysInCurrentModelRequest", {})
return resp.get("data", {})
async def list_parameters(self) -> dict:
"""List all input parameters for the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InputParameterListRequest", {})
return resp.get("data", {})
def status(self) -> dict:
return {
"connected": self._connected,
"authenticated": self._authenticated,
"ws_url": self.ws_url,
"current_expression": self._current_expression,
"expression_count": len(self.expression_map),
"expressions": list(self.expression_map.keys()),
}
# ---------------------------------------------------------------------------
# HTTP Server (asyncio-based, no external deps)
# ---------------------------------------------------------------------------
class BridgeHTTPHandler:
"""Simple async HTTP request handler for the bridge API."""
def __init__(self, client: VTubeClient):
self.client = client
async def handle(self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
try:
request_line = await asyncio.wait_for(reader.readline(), timeout=5)
if not request_line:
writer.close()
return
method, path, _ = request_line.decode().strip().split(" ", 2)
path = path.split("?")[0] # strip query params
# Read headers
content_length = 0
while True:
line = await reader.readline()
if line == b"\r\n" or not line:
break
if line.lower().startswith(b"content-length:"):
content_length = int(line.split(b":")[1].strip())
# Read body
body = None
if content_length > 0:
body = await reader.read(content_length)
# Route
try:
result = await self._route(method, path, body)
await self._respond(writer, 200, result)
except Exception as e:
logger.error("Handler error: %s", e, exc_info=True)
await self._respond(writer, 500, {"error": str(e)})
except asyncio.TimeoutError:
writer.close()
except Exception as e:
logger.error("Connection error: %s", e)
try:
writer.close()
except Exception:
pass
async def _route(self, method: str, path: str, body: bytes | None) -> dict:
data = {}
if body:
try:
data = json.loads(body)
except json.JSONDecodeError:
return {"error": "Invalid JSON"}
if method == "GET" and path == "/status":
return self.client.status()
if method == "GET" and path == "/expressions":
return {
"expressions": self.client.expression_map,
"triggers": self.client.ws_triggers,
}
if method == "GET" and path == "/hotkeys":
return await self.client.list_hotkeys()
if method == "GET" and path == "/parameters":
return await self.client.list_parameters()
if method == "POST" and path == "/expression":
event = data.get("event")
if not event:
return {"error": "Missing 'event' field"}
return await self.client.trigger_expression(event)
if method == "POST" and path == "/parameter":
name = data.get("name")
value = data.get("value")
if name is None or value is None:
return {"error": "Missing 'name' or 'value' field"}
return await self.client.set_parameter(name, float(value), float(data.get("weight", 1.0)))
if method == "POST" and path == "/parameters":
if not isinstance(data, list):
return {"error": "Expected JSON array of {name, value} objects"}
return await self.client.set_parameters(data)
if method == "POST" and path == "/auth":
return await self.client.request_new_token()
if method == "POST" and path == "/reload":
return self.client.reload_character()
return {"error": f"Unknown route: {method} {path}"}
async def _respond(self, writer: asyncio.StreamWriter, status: int, data: dict):
body = json.dumps(data, indent=2).encode()
status_text = HTTPStatus(status).phrase
header = (
f"HTTP/1.1 {status} {status_text}\r\n"
f"Content-Type: application/json\r\n"
f"Content-Length: {len(body)}\r\n"
f"Access-Control-Allow-Origin: *\r\n"
f"Access-Control-Allow-Methods: GET, POST, OPTIONS\r\n"
f"Access-Control-Allow-Headers: Content-Type\r\n"
f"\r\n"
)
writer.write(header.encode() + body)
await writer.drain()
writer.close()
# ---------------------------------------------------------------------------
# Auto-reconnect loop
# ---------------------------------------------------------------------------
async def reconnect_loop(client: VTubeClient, interval: float = 5.0):
"""Background task that keeps the VTube Studio connection alive."""
while True:
try:
if not client._connected or client._ws is None or client._ws.closed:
logger.info("Connection lost — attempting reconnect...")
await client.connect()
except Exception as e:
logger.debug("Reconnect failed: %s (retrying in %.0fs)", e, interval)
await asyncio.sleep(interval)
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
async def main(args):
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
)
character_path = Path(args.character)
client = VTubeClient(args.vtube_url, character_path)
# Try initial connection (don't fail if VTube Studio isn't running yet)
try:
await client.connect()
except Exception as e:
logger.warning("Initial connection failed: %s (will keep retrying)", e)
# Start reconnect loop
reconnect_task = asyncio.create_task(reconnect_loop(client, interval=5.0))
# Start HTTP server
handler = BridgeHTTPHandler(client)
server = await asyncio.start_server(handler.handle, "0.0.0.0", args.port)
logger.info("HTTP API listening on http://0.0.0.0:%d", args.port)
logger.info("Endpoints: /status /expression /parameter /parameters /auth /reload /hotkeys")
# Graceful shutdown
stop = asyncio.Event()
def _signal_handler():
logger.info("Shutting down...")
stop.set()
    loop = asyncio.get_running_loop()  # preferred over get_event_loop() inside a coroutine
for sig in (signal.SIGINT, signal.SIGTERM):
loop.add_signal_handler(sig, _signal_handler)
async with server:
await stop.wait()
reconnect_task.cancel()
await client.disconnect()
logger.info("Goodbye.")
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge")
parser.add_argument("--port", type=int, default=DEFAULT_HTTP_PORT, help="HTTP API port (default: 8002)")
parser.add_argument("--vtube-url", default=DEFAULT_VTUBE_WS_URL, help="VTube Studio WebSocket URL")
parser.add_argument("--character", default=str(DEFAULT_CHARACTER_PATH), help="Path to character JSON")
parser.add_argument("--verbose", "-v", action="store_true", help="Debug logging")
args = parser.parse_args()
asyncio.run(main(args))
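
For reference, the bridge's HTTP surface can be exercised with a few lines of stdlib Python. The endpoint paths and payload shapes below come straight from BridgeHTTPHandler._route; the port assumes the default 8002, and the "smile" event name is a hypothetical example:

import json
import urllib.request

BASE = "http://localhost:8002"  # DEFAULT_HTTP_PORT

def call(method: str, path: str, payload: object = None) -> dict:
    """Tiny helper for poking the bridge API."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        BASE + path, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())

print(call("GET", "/status"))                                   # connection + auth state
print(call("POST", "/expression", {"event": "smile"}))          # hypothetical event name
print(call("POST", "/parameter", {"name": "MouthOpen", "value": 0.5}))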

View File

@@ -18,6 +18,12 @@
 		<string>1.0</string>
 	</array>
+	<key>EnvironmentVariables</key>
+	<dict>
+		<key>ELEVENLABS_API_KEY</key>
+		<string>sk_ec10e261c6190307a37aa161a9583504dcf25a0cabe5dbd5</string>
+	</dict>
 	<key>RunAtLoad</key>
 	<true/>

View File

@@ -7,8 +7,10 @@ Usage:
 import argparse
 import asyncio
+import json
 import logging
 import os
+import urllib.request
 
 import numpy as np
@@ -20,10 +22,76 @@ from wyoming.tts import Synthesize
 _LOGGER = logging.getLogger(__name__)
 
+ACTIVE_TTS_VOICE_PATH = os.path.expanduser("~/homeai-data/active-tts-voice.json")
+
 SAMPLE_RATE = 24000
 SAMPLE_WIDTH = 2  # int16
 CHANNELS = 1
 CHUNK_SECONDS = 1  # stream in 1-second chunks
+
+VTUBE_BRIDGE_URL = "http://localhost:8002"
+LIPSYNC_ENABLED = True
+LIPSYNC_FRAME_SAMPLES = 1200  # 50ms frames at 24kHz → 20 updates/sec
+LIPSYNC_SCALE = 10.0  # amplitude multiplier (tuned for Kokoro output levels)
+
+
+def _send_lipsync(value: float):
+    """Fire-and-forget POST to vtube-bridge with mouth open value."""
+    try:
+        body = json.dumps({"name": "MouthOpen", "value": value}).encode()
+        req = urllib.request.Request(
+            f"{VTUBE_BRIDGE_URL}/parameter",
+            data=body,
+            headers={"Content-Type": "application/json"},
+            method="POST",
+        )
+        urllib.request.urlopen(req, timeout=0.5)
+    except Exception:
+        pass  # bridge may not be running
+
+
+def _compute_lipsync_frames(samples_int16: np.ndarray) -> list[float]:
+    """Compute per-frame RMS amplitude scaled to 0–1 for lip sync."""
+    frames = []
+    for i in range(0, len(samples_int16), LIPSYNC_FRAME_SAMPLES):
+        frame = samples_int16[i : i + LIPSYNC_FRAME_SAMPLES].astype(np.float32)
+        rms = np.sqrt(np.mean(frame ** 2)) / 32768.0
+        mouth = min(rms * LIPSYNC_SCALE, 1.0)
+        frames.append(round(mouth, 3))
+    return frames
+
+
+def _get_active_tts_config() -> dict | None:
+    """Read the active TTS config set by the OpenClaw bridge."""
+    try:
+        with open(ACTIVE_TTS_VOICE_PATH) as f:
+            return json.load(f)
+    except Exception:
+        return None
+
+
+def _synthesize_elevenlabs(text: str, voice_id: str, model: str = "eleven_multilingual_v2") -> bytes:
+    """Call ElevenLabs TTS API and return raw PCM audio bytes (24kHz 16-bit mono)."""
+    api_key = os.environ.get("ELEVENLABS_API_KEY", "")
+    if not api_key:
+        raise RuntimeError("ELEVENLABS_API_KEY not set")
+    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}?output_format=pcm_24000"
+    payload = json.dumps({
+        "text": text,
+        "model_id": model,
+        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
+    }).encode()
+    req = urllib.request.Request(
+        url,
+        data=payload,
+        headers={
+            "Content-Type": "application/json",
+            "xi-api-key": api_key,
+        },
+        method="POST",
+    )
+    with urllib.request.urlopen(req, timeout=30) as resp:
+        return resp.read()
+
+
 def _load_kokoro():
@@ -76,26 +144,53 @@ class KokoroEventHandler(AsyncEventHandler):
         synthesize = Synthesize.from_event(event)
         text = synthesize.text
         voice = self._default_voice
+        use_elevenlabs = False
 
-        if synthesize.voice and synthesize.voice.name:
+        # Bridge state file takes priority (set per-request by OpenClaw bridge)
+        tts_config = _get_active_tts_config()
+        if tts_config and tts_config.get("engine") == "elevenlabs":
+            use_elevenlabs = True
+            voice = tts_config.get("elevenlabs_voice_id", "")
+            _LOGGER.debug("Synthesizing %r with ElevenLabs voice=%s", text, voice)
+        elif tts_config and tts_config.get("kokoro_voice"):
+            voice = tts_config["kokoro_voice"]
+        elif synthesize.voice and synthesize.voice.name:
             voice = synthesize.voice.name
 
-        _LOGGER.debug("Synthesizing %r with voice=%s speed=%.1f", text, voice, self._speed)
-
         try:
             loop = asyncio.get_event_loop()
-            samples, sample_rate = await loop.run_in_executor(
-                None, lambda: self._tts.create(text, voice=voice, speed=self._speed)
-            )
-            samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
-            audio_bytes = samples_int16.tobytes()
+            if use_elevenlabs and voice:
+                # ElevenLabs returns PCM 24kHz 16-bit mono
+                model = tts_config.get("elevenlabs_model", "eleven_multilingual_v2")
+                _LOGGER.info("Using ElevenLabs TTS (model=%s, voice=%s)", model, voice)
+                pcm_bytes = await loop.run_in_executor(
+                    None, lambda: _synthesize_elevenlabs(text, voice, model)
+                )
+                samples_int16 = np.frombuffer(pcm_bytes, dtype=np.int16)
+                audio_bytes = pcm_bytes
+            else:
+                _LOGGER.debug("Synthesizing %r with Kokoro voice=%s speed=%.1f", text, voice, self._speed)
+                samples, sample_rate = await loop.run_in_executor(
+                    None, lambda: self._tts.create(text, voice=voice, speed=self._speed)
+                )
+                samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
+                audio_bytes = samples_int16.tobytes()
+
+            # Pre-compute lip sync frames for the entire utterance
+            lipsync_frames = []
+            if LIPSYNC_ENABLED:
+                lipsync_frames = _compute_lipsync_frames(samples_int16)
 
             await self.write_event(
                 AudioStart(rate=SAMPLE_RATE, width=SAMPLE_WIDTH, channels=CHANNELS).event()
             )
 
             chunk_size = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_SECONDS
+            lipsync_idx = 0
+            samples_per_chunk = SAMPLE_RATE * CHUNK_SECONDS
+            frames_per_chunk = samples_per_chunk // LIPSYNC_FRAME_SAMPLES
             for i in range(0, len(audio_bytes), chunk_size):
                 await self.write_event(
                     AudioChunk(
@@ -106,8 +201,22 @@ class KokoroEventHandler(AsyncEventHandler):
                     ).event()
                 )
 
+                # Send lip sync frames for this audio chunk
+                if LIPSYNC_ENABLED and lipsync_frames:
+                    chunk_frames = lipsync_frames[lipsync_idx : lipsync_idx + frames_per_chunk]
+                    for mouth_val in chunk_frames:
+                        await asyncio.get_event_loop().run_in_executor(
+                            None, _send_lipsync, mouth_val
+                        )
+                    lipsync_idx += frames_per_chunk
+
+            # Close mouth after speech
+            if LIPSYNC_ENABLED:
+                await asyncio.get_event_loop().run_in_executor(None, _send_lipsync, 0.0)
+
             await self.write_event(AudioStop().event())
-            _LOGGER.info("Synthesized %.1fs of audio", len(samples) / sample_rate)
+            duration = len(samples_int16) / SAMPLE_RATE
+            _LOGGER.info("Synthesized %.1fs of audio (%d lipsync frames)", duration, len(lipsync_frames))
         except Exception:
             _LOGGER.exception("Synthesis error")
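
As a sanity check on the lip-sync arithmetic in this diff: a 1-second chunk is 24 000 samples, so frames_per_chunk = 24000 // 1200 = 20, i.e. one mouth update every 50 ms. A standalone sketch of the same RMS computation on synthetic audio (the 220 Hz test tone is an arbitrary stand-in for Kokoro output):

import numpy as np

LIPSYNC_FRAME_SAMPLES = 1200  # 50 ms at 24 kHz
LIPSYNC_SCALE = 10.0

# One second of a 220 Hz tone at 25% of full scale.
t = np.arange(24000) / 24000.0
samples_int16 = (0.25 * 32767 * np.sin(2 * np.pi * 220 * t)).astype(np.int16)

frames = []
for i in range(0, len(samples_int16), LIPSYNC_FRAME_SAMPLES):
    frame = samples_int16[i : i + LIPSYNC_FRAME_SAMPLES].astype(np.float32)
    rms = np.sqrt(np.mean(frame ** 2)) / 32768.0  # normalise into 0–1
    frames.append(round(min(rms * LIPSYNC_SCALE, 1.0), 3))

print(len(frames))  # 20 — one mouth update per 50 ms of audio
print(frames[0])    # sine RMS = 0.25/√2 ≈ 0.177; ×10 saturates the cap, so 1.0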

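The per-character voice routing rests entirely on that small JSON state file. A minimal sketch of the writer side, assuming the OpenClaw bridge simply overwrites the file before each request; the key names match what _get_active_tts_config consumers read above, while the concrete voice values and the "kokoro" engine label are placeholders:

import json
import os

STATE_PATH = os.path.expanduser("~/homeai-data/active-tts-voice.json")

def set_active_voice(config: dict) -> None:
    """Overwrite the state file the Wyoming TTS server polls per request."""
    os.makedirs(os.path.dirname(STATE_PATH), exist_ok=True)
    with open(STATE_PATH, "w") as f:
        json.dump(config, f)

# Route the next synthesis through ElevenLabs (voice ID is a placeholder):
set_active_voice({
    "engine": "elevenlabs",
    "elevenlabs_voice_id": "YOUR_VOICE_ID",
    "elevenlabs_model": "eleven_multilingual_v2",
})

# ...or back to local Kokoro with a specific voice:
set_active_voice({"engine": "kokoro", "kokoro_voice": "af_heart"})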
start.sh (new executable file, 221 lines)
View File

@@ -0,0 +1,221 @@
#!/usr/bin/env bash
# start.sh — Start all HomeAI services on LINDBLUM
#
# Usage:
# ./start.sh # start everything
# ./start.sh status # show what's running
# ./start.sh stop # stop everything
# ./start.sh restart # stop then start everything
set -euo pipefail
REPO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${REPO_DIR}/scripts/common.sh"
DOCKER_DIR="${HOME}/server/docker"
# ─── Service definitions ──────────────────────────────────────────────────────
LAUNCHD_SERVICES=(
com.homeai.ollama
com.homeai.wyoming-stt
com.homeai.wyoming-tts
com.homeai.wyoming-satellite
com.homeai.openclaw
com.homeai.openclaw-bridge
com.homeai.preload-models
com.homeai.dashboard
)
DOCKER_COMPOSE_DIRS=(
"${DOCKER_DIR}/open-webui"
"${DOCKER_DIR}/uptime-kuma"
"${DOCKER_DIR}/n8n"
"${DOCKER_DIR}/code-server"
)
# ─── Helpers ──────────────────────────────────────────────────────────────────
launchd_running() {
local label="$1"
local pid
pid=$(launchctl list "$label" 2>/dev/null | grep '"PID"' | sed 's/[^0-9]//g' || true)
[[ -n "$pid" && "$pid" -gt 0 ]] 2>/dev/null
}
start_launchd() {
local label="$1"
local plist="${HOME}/Library/LaunchAgents/${label}.plist"
if [[ ! -f "$plist" ]]; then
log_warn "Plist not found: $plist — skipping"
return
fi
if launchd_running "$label"; then
log_info "Already running: $label"
return
fi
log_step "Starting $label"
# bootstrap loads + starts the agent
launchctl bootstrap "gui/$(id -u)" "$plist" 2>/dev/null \
|| launchctl kickstart -k "gui/$(id -u)/$label" 2>/dev/null \
|| true
sleep 0.5
if launchd_running "$label"; then
log_success "$label started"
else
log_warn "$label may not have started — check logs"
fi
}
stop_launchd() {
local label="$1"
local plist="${HOME}/Library/LaunchAgents/${label}.plist"
if [[ ! -f "$plist" ]]; then
return
fi
if ! launchd_running "$label"; then
log_info "Already stopped: $label"
return
fi
log_step "Stopping $label"
launchctl bootout "gui/$(id -u)/$label" 2>/dev/null || true
log_success "$label stopped"
}
start_docker_services() {
ensure_docker_running
for dir in "${DOCKER_COMPOSE_DIRS[@]}"; do
if [[ ! -d "$dir" ]]; then
log_warn "Docker compose dir not found: $dir — skipping"
continue
fi
local name
name=$(basename "$dir")
log_step "Starting Docker stack: $name"
docker_compose -f "${dir}/docker-compose.yml" up -d 2>/dev/null \
|| docker_compose -f "${dir}/compose.yml" up -d 2>/dev/null \
|| log_warn "No compose file found in $dir"
done
}
stop_docker_services() {
for dir in "${DOCKER_COMPOSE_DIRS[@]}"; do
if [[ ! -d "$dir" ]]; then
continue
fi
local name
name=$(basename "$dir")
log_step "Stopping Docker stack: $name"
docker_compose -f "${dir}/docker-compose.yml" down 2>/dev/null \
|| docker_compose -f "${dir}/compose.yml" down 2>/dev/null \
|| true
done
}
# ─── Commands ─────────────────────────────────────────────────────────────────
do_start() {
log_section "Starting HomeAI Services"
echo ""
log_info "── LaunchD services ──"
for svc in "${LAUNCHD_SERVICES[@]}"; do
start_launchd "$svc"
done
echo ""
log_info "── Docker services ──"
start_docker_services
echo ""
log_success "All services started."
echo ""
do_status
}
do_stop() {
log_section "Stopping HomeAI Services"
echo ""
log_info "── LaunchD services ──"
for svc in "${LAUNCHD_SERVICES[@]}"; do
stop_launchd "$svc"
done
echo ""
log_info "── Docker services ──"
stop_docker_services
echo ""
log_success "All services stopped."
}
do_status() {
log_section "HomeAI Service Status"
# LaunchD services
printf "\n ${BOLD}%-34s %-10s %s${RESET}\n" "SERVICE" "STATUS" "PID"
printf " %-34s %-10s %s\n" "─────────────────────────────────" "────────" "─────"
for svc in "${LAUNCHD_SERVICES[@]}"; do
local pid
pid=$(launchctl list "$svc" 2>/dev/null | grep '"PID"' | sed 's/[^0-9]//g' || true)
if [[ -n "$pid" && "$pid" -gt 0 ]] 2>/dev/null; then
printf " ${GREEN}${RESET} %-32s ${GREEN}%-10s${RESET} %s\n" "$svc" "running" "$pid"
else
printf " ${RED}${RESET} %-32s ${RED}%-10s${RESET} %s\n" "$svc" "stopped" "-"
fi
done
# Docker services
echo ""
local containers=("homeai-open-webui" "homeai-uptime-kuma" "homeai-n8n" "homeai-code-server")
for c in "${containers[@]}"; do
local state
state=$(docker inspect -f '{{.State.Status}}' "$c" 2>/dev/null || echo "not found")
if [[ "$state" == "running" ]]; then
printf " ${GREEN}${RESET} %-32s ${GREEN}%-10s${RESET}\n" "$c" "$state"
else
printf " ${RED}${RESET} %-32s ${RED}%-10s${RESET}\n" "$c" "$state"
fi
done
echo ""
}
# ─── Main ─────────────────────────────────────────────────────────────────────
main() {
local cmd="${1:-start}"
case "$cmd" in
start) do_start ;;
stop) do_stop ;;
restart) do_stop; echo ""; do_start ;;
status) do_status ;;
-h|--help|help)
echo ""
echo " Usage: $0 [start|stop|restart|status]"
echo ""
echo " Commands:"
echo " start Start all HomeAI services (default)"
echo " stop Stop all HomeAI services"
echo " restart Stop then start all services"
echo " status Show current service status"
echo ""
;;
*) die "Unknown command: $cmd. Run '$0 help' for usage." ;;
esac
}
main "$@"