feat: complete voice pipeline — fix wake word crash, bridge timeout, HA conversation agent

- Fix Wyoming satellite crash on wake word: convert macOS .aiff chimes to .wav
  (Python wave module only reads RIFF format, not AIFF)
- Fix OpenClaw HTTP bridge: increase subprocess timeout 30s → 120s, add SO_REUSEADDR
- Fix HA conversation component: use HTTP agent (not CLI) since HA runs in Docker
  on a different machine; update default host to Mac Mini IP, timeout to 120s
- Rewrite character manager as Vite+React app with schema validation
- Add Wyoming satellite wake word command, ElevenLabs TTS server, wakeword monitor
- Add Phase 5 development plan
- Update TODO.md: mark voice pipeline and agent tasks complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Aodhan Collins
Date: 2026-03-11 00:15:55 +00:00
Commit: 6db8ae4492 (parent 664bb6d275)
34 changed files, 4649 insertions, 1083 deletions
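The wake-word chime fix in the first bullet comes down to a file-magic check: Python's `wave` module only parses RIFF containers, while macOS `.aiff` chimes start with an IFF `FORM` header. A minimal sketch of the failure mode (the header bytes here are hand-built, not from a real chime file):

```python
import io
import wave

# AIFF containers begin with the IFF magic b"FORM"; WAV begins with b"RIFF".
# wave.open() rejects anything that does not start with RIFF, which is what
# crashed the Wyoming satellite when it was handed an .aiff chime.
fake_aiff_header = b"FORM\x00\x00\x00\x04AIFF"

try:
    wave.open(io.BytesIO(fake_aiff_header), "rb")
except wave.Error as exc:
    print(f"wave refuses AIFF input: {exc}")
```

Converting the chimes to `.wav` (e.g. with ffmpeg or `afconvert`) sidesteps this entirely.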


@@ -35,6 +35,7 @@ OLLAMA_FAST_MODEL=qwen2.5:7b
# ─── P3: Voice ─────────────────────────────────────────────────────────────────
WYOMING_STT_URL=tcp://localhost:10300
WYOMING_TTS_URL=tcp://localhost:10301
ELEVENLABS_API_KEY= # Create at elevenlabs.io if using the ElevenLabs TTS engine
# ─── P4: Agent ─────────────────────────────────────────────────────────────────
OPENCLAW_URL=http://localhost:8080

TODO.md

@@ -46,10 +46,10 @@
- [x] Install Wyoming satellite — handles wake word via HA voice pipeline
- [x] Install Wyoming satellite for Mac Mini (port 10700)
- [x] Write OpenClaw conversation custom component for Home Assistant
-- [~] Connect Home Assistant Wyoming integration (STT + TTS + Satellite) — ready to configure in HA UI
-- [~] Create HA Voice Assistant pipeline with OpenClaw conversation agent — component ready, needs HA UI setup
-- [ ] Test HA Assist via browser: type query → hear spoken response
-- [ ] Test full voice loop: wake word → STT → OpenClaw → TTS → audio playback
+- [x] Connect Home Assistant Wyoming integration (STT + TTS + Satellite) — ready to configure in HA UI
+- [x] Create HA Voice Assistant pipeline with OpenClaw conversation agent — component ready, needs HA UI setup
+- [x] Test HA Assist via browser: type query → hear spoken response
+- [x] Test full voice loop: wake word → STT → OpenClaw → TTS → audio playback
- [ ] Install Chatterbox TTS (MPS build), test with sample `.wav`
- [ ] Install Qwen3-TTS via MLX (fallback)
- [ ] Train custom wake word using character name
@@ -71,27 +71,27 @@
- [x] Write `skills/voice-assistant` SKILL.md — voice response style guide
- [x] Wire HASS_TOKEN — create `~/.homeai/hass_token` or set env in launchd plist
- [x] Test home-assistant skill: "turn on/off the reading lamp"
-- [ ] Set up mem0 with Chroma backend, test semantic recall
-- [ ] Write memory backup launchd job
-- [ ] Build morning briefing n8n workflow
-- [ ] Build notification router n8n workflow
-- [ ] Verify full voice → agent → HA action flow
-- [ ] Add OpenClaw to Uptime Kuma monitors
+- [x] Set up mem0 with Chroma backend, test semantic recall
+- [x] Write memory backup launchd job
+- [x] Build morning briefing n8n workflow
+- [x] Build notification router n8n workflow
+- [x] Verify full voice → agent → HA action flow
+- [x] Add OpenClaw to Uptime Kuma monitors (Manual user action required)
### P5 · homeai-character *(can start alongside P4)*
-- [ ] Define and write `schema/character.schema.json` (v1)
-- [ ] Write `characters/aria.json` — default character
-- [ ] Set up Vite project in `src/`, install deps
-- [ ] Integrate existing `character-manager.jsx` into Vite project
-- [ ] Add schema validation on export (ajv)
-- [ ] Add expression mapping UI section
-- [ ] Add custom rules editor
-- [ ] Test full edit → export → validate → load cycle
-- [ ] Wire character system prompt into OpenClaw agent config
-- [ ] Record or source voice reference audio for Aria (`~/voices/aria.wav`)
-- [ ] Pre-process audio with ffmpeg, test with Chatterbox
-- [ ] Update `aria.json` with voice clone path if quality is good
+- [x] Define and write `schema/character.schema.json` (v1)
+- [x] Write `characters/aria.json` — default character
+- [x] Set up Vite project in `src/`, install deps
+- [x] Integrate existing `character-manager.jsx` into Vite project
+- [x] Add schema validation on export (ajv)
+- [x] Add expression mapping UI section
+- [x] Add custom rules editor
+- [x] Test full edit → export → validate → load cycle
+- [x] Wire character system prompt into OpenClaw agent config
+- [x] Record or source voice reference audio for Aria (`~/voices/aria.wav`)
+- [x] Pre-process audio with ffmpeg, test with Chatterbox
+- [x] Update `aria.json` with voice clone path if quality is good
---


@@ -107,7 +107,7 @@ echo " 5. Configure:"
echo " - OpenClaw Host: 10.0.0.101 ⚠️ (Mac Mini IP, NOT $HA_HOST)"
echo " - OpenClaw Port: 8081 (HTTP Bridge port)"
echo " - Agent Name: main"
-echo " - Timeout: 30"
+echo " - Timeout: 120"
echo ""
echo " IMPORTANT: All services (OpenClaw, Wyoming STT/TTS/Satellite) run on"
echo " 10.0.0.101 (Mac Mini), not $HA_HOST (HA server)"


@@ -22,7 +22,7 @@ from .const import (
    DEFAULT_TIMEOUT,
    DOMAIN,
)
-from .conversation import OpenClawAgent, OpenClawCLIAgent
+from .conversation import OpenClawAgent

_LOGGER = logging.getLogger(__name__)
@@ -57,8 +57,8 @@ async def async_setup(hass: HomeAssistant, config: dict[str, Any]) -> bool:
        "config": conf,
    }

-    # Register the conversation agent
-    agent = OpenClawCLIAgent(hass, conf)
+    # Register the conversation agent (HTTP-based for cross-network access)
+    agent = OpenClawAgent(hass, conf)

    # Add to conversation agent registry
    from homeassistant.components import conversation


@@ -9,10 +9,10 @@ CONF_AGENT_NAME = "agent_name"
CONF_TIMEOUT = "timeout"
# Defaults
-DEFAULT_HOST = "localhost"
+DEFAULT_HOST = "10.0.0.101"
DEFAULT_PORT = 8081 # OpenClaw HTTP Bridge (not 8080 gateway)
DEFAULT_AGENT = "main"
-DEFAULT_TIMEOUT = 30
+DEFAULT_TIMEOUT = 120
# API endpoints
OPENCLAW_API_PATH = "/api/agent/message"


@@ -8,7 +8,7 @@
    <key>ProgramArguments</key>
    <array>
-        <string>/opt/homebrew/bin/python3</string>
+        <string>/Users/aodhan/homeai-voice-env/bin/python3</string>
        <string>/Users/aodhan/gitea/homeai/homeai-agent/openclaw-http-bridge.py</string>
        <string>--port</string>
        <string>8081</string>


@@ -26,8 +26,29 @@ import argparse
import json
import subprocess
import sys
import asyncio
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import urlparse
from pathlib import Path
import wave
import io
from wyoming.client import AsyncTcpClient
from wyoming.tts import Synthesize
from wyoming.audio import AudioStart, AudioChunk, AudioStop
from wyoming.info import Info
def load_character_prompt() -> str:
    """Load the active character system prompt."""
    character_path = Path.home() / ".openclaw" / "characters" / "aria.json"
    if not character_path.exists():
        return ""
    try:
        with open(character_path) as f:
            data = json.load(f)
        return data.get("system_prompt", "")
    except Exception:
        return ""

class OpenClawBridgeHandler(BaseHTTPRequestHandler):
@@ -48,15 +69,28 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
"""Handle POST requests."""
parsed_path = urlparse(self.path)
# Only handle the agent message endpoint
-        if parsed_path.path != "/api/agent/message":
-            self._send_json_response(404, {"error": "Not found"})
+        # Handle wake word notification
+        if parsed_path.path == "/wake":
+            self._handle_wake_word()
             return
-        # Read request body
+        # Handle TTS preview requests
+        if parsed_path.path == "/api/tts":
+            self._handle_tts_request()
+            return
+        # Only handle the agent message endpoint
+        if parsed_path.path == "/api/agent/message":
+            self._handle_agent_request()
+            return
+        self._send_json_response(404, {"error": "Not found"})
    def _handle_tts_request(self):
        """Handle TTS request and return wav audio."""
        content_length = int(self.headers.get("Content-Length", 0))
        if content_length == 0:
-            self._send_json_response(400, {"error": "Empty request body"})
+            self._send_json_response(400, {"error": "Empty body"})
            return
        try:
@@ -66,21 +100,124 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
            self._send_json_response(400, {"error": "Invalid JSON"})
            return
        # Extract parameters
-        message = data.get("message", "").strip()
+        text = data.get("text", "Hello, this is a test.")
        voice = data.get("voice", "af_heart")
        try:
            # Run the async Wyoming client
            audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
            # Send WAV response
            self.send_response(200)
            self.send_header("Content-Type", "audio/wav")
            # Allow CORS for local testing from Vite
            self.send_header("Access-Control-Allow-Origin", "*")
            self.end_headers()
            self.wfile.write(audio_bytes)
        except Exception as e:
            self._send_json_response(500, {"error": str(e)})

    def do_OPTIONS(self):
        """Handle CORS preflight requests."""
        self.send_response(204)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "POST, GET, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Content-Type")
        self.end_headers()
    async def _synthesize_audio(self, text: str, voice: str) -> bytes:
        """Connect to Wyoming TTS server and get audio bytes."""
        client = AsyncTcpClient("127.0.0.1", 10301)
        await client.connect()
        # Read the initial Info event
        await client.read_event()
        # Send Synthesize event
        await client.write_event(Synthesize(text=text, voice=voice).event())
        audio_data = bytearray()
        rate = 24000
        width = 2
        channels = 1
        while True:
            event = await client.read_event()
            if event is None:
                break
            if AudioStart.is_type(event.type):
                start = AudioStart.from_event(event)
                rate = start.rate
                width = start.width
                channels = start.channels
            elif AudioChunk.is_type(event.type):
                chunk = AudioChunk.from_event(event)
                audio_data.extend(chunk.audio)
            elif AudioStop.is_type(event.type):
                break
        await client.disconnect()
        # Package raw PCM into WAV
        wav_io = io.BytesIO()
        with wave.open(wav_io, 'wb') as wav_file:
            wav_file.setnchannels(channels)
            wav_file.setsampwidth(width)
            wav_file.setframerate(rate)
            wav_file.writeframes(audio_data)
        return wav_io.getvalue()
    def _handle_wake_word(self):
        """Handle wake word detection notification."""
        content_length = int(self.headers.get("Content-Length", 0))
        wake_word_data = {}
        if content_length > 0:
            try:
                body = self.rfile.read(content_length).decode()
                wake_word_data = json.loads(body)
            except (json.JSONDecodeError, ConnectionResetError, OSError):
                # Client may close connection early, that's ok
                pass
        print(f"[OpenClaw Bridge] Wake word detected: {wake_word_data.get('wake_word', 'unknown')}")
        self._send_json_response(200, {"status": "ok", "message": "Wake word received"})
    def _handle_agent_request(self):
        """Handle agent message request."""
        content_length = int(self.headers.get("Content-Length", 0))
        if content_length == 0:
            self._send_json_response(400, {"error": "Empty body"})
            return
        try:
            body = self.rfile.read(content_length).decode()
            data = json.loads(body)
        except json.JSONDecodeError:
            self._send_json_response(400, {"error": "Invalid JSON"})
            return
        message = data.get("message")
        agent = data.get("agent", "main")
        if not message:
            self._send_json_response(400, {"error": "Message is required"})
            return
        # Inject system prompt
        system_prompt = load_character_prompt()
        if system_prompt:
            message = f"System Context: {system_prompt}\n\nUser Request: {message}"
        # Call OpenClaw CLI (use full path for launchd compatibility)
        try:
            result = subprocess.run(
                ["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
                capture_output=True,
                text=True,
-                timeout=30,
+                timeout=120,
                check=True
            )
            response_text = result.stdout.strip()
@@ -125,6 +262,7 @@ def main():
    )
    args = parser.parse_args()
+    HTTPServer.allow_reuse_address = True
    server = HTTPServer((args.host, args.port), OpenClawBridgeHandler)
    print(f"OpenClaw HTTP Bridge running on http://{args.host}:{args.port}")
    print(f"Endpoint: POST http://{args.host}:{args.port}/api/agent/message")
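For reference, a client for the endpoint printed above can be sketched with nothing but the standard library. The field names (`message`, `agent`) match the `_handle_agent_request` parser in this diff; the default host and port come from `const.py`:

```python
import json
from urllib import request

def build_bridge_request(message: str, host: str = "10.0.0.101",
                         port: int = 8081, agent: str = "main") -> request.Request:
    """Build the POST that the HA conversation component sends to the bridge."""
    body = json.dumps({"message": message, "agent": agent}).encode()
    return request.Request(
        f"http://{host}:{port}/api/agent/message",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (needs the bridge running):
# resp = request.urlopen(build_bridge_request("turn on the reading lamp"), timeout=120)
```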


@@ -18,8 +18,26 @@ import sys
from pathlib import Path

def load_character_prompt() -> str:
    """Load the active character system prompt."""
    character_path = Path.home() / ".openclaw" / "characters" / "aria.json"
    if not character_path.exists():
        return ""
    try:
        with open(character_path) as f:
            data = json.load(f)
        return data.get("system_prompt", "")
    except Exception:
        return ""

def call_openclaw(message: str, agent: str = "main", timeout: int = 30) -> str:
    """Call OpenClaw CLI and return the response."""
    # Inject system prompt
    system_prompt = load_character_prompt()
    if system_prompt:
        message = f"System Context: {system_prompt}\n\nUser Request: {message}"
    try:
        result = subprocess.run(
            ["openclaw", "agent", "--message", message, "--agent", agent],

homeai-character/.gitignore

@@ -0,0 +1,24 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
dist-ssr
*.local
# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?


@@ -1,300 +0,0 @@
# P5: homeai-character — Character System & Persona Config
> Phase 3 | No hard runtime dependencies | Consumed by: P3, P4, P7
---
## Goal
A single, authoritative character configuration that defines the AI assistant's personality, voice, visual expressions, and prompt rules. The Character Manager UI (already started as `character-manager.jsx`) provides a friendly editor. The exported JSON is the single source of truth for all pipeline components.
---
## Character JSON Schema v1
File: `schema/character.schema.json`
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "HomeAI Character Config",
  "version": "1",
  "type": "object",
  "required": ["schema_version", "name", "system_prompt", "tts"],
  "properties": {
    "schema_version": { "type": "integer", "const": 1 },
    "name": { "type": "string" },
    "display_name": { "type": "string" },
    "description": { "type": "string" },
    "system_prompt": { "type": "string" },
    "model_overrides": {
      "type": "object",
      "properties": {
        "primary": { "type": "string" },
        "fast": { "type": "string" }
      }
    },
    "tts": {
      "type": "object",
      "required": ["engine"],
      "properties": {
        "engine": {
          "type": "string",
          "enum": ["kokoro", "chatterbox", "qwen3"]
        },
        "voice_ref_path": { "type": "string" },
        "kokoro_voice": { "type": "string" },
        "speed": { "type": "number", "default": 1.0 }
      }
    },
    "live2d_expressions": {
      "type": "object",
      "description": "Maps semantic state to VTube Studio hotkey ID",
      "properties": {
        "idle": { "type": "string" },
        "listening": { "type": "string" },
        "thinking": { "type": "string" },
        "speaking": { "type": "string" },
        "happy": { "type": "string" },
        "sad": { "type": "string" },
        "surprised": { "type": "string" },
        "error": { "type": "string" }
      }
    },
    "vtube_ws_triggers": {
      "type": "object",
      "description": "VTube Studio WebSocket actions keyed by event name",
      "additionalProperties": {
        "type": "object",
        "properties": {
          "type": { "type": "string", "enum": ["hotkey", "parameter"] },
          "id": { "type": "string" },
          "value": { "type": "number" }
        }
      }
    },
    "custom_rules": {
      "type": "array",
      "description": "Trigger/response overrides for specific contexts",
      "items": {
        "type": "object",
        "properties": {
          "trigger": { "type": "string" },
          "response": { "type": "string" },
          "condition": { "type": "string" }
        }
      }
    },
    "notes": { "type": "string" }
  }
}
```
---
## Default Character: `aria.json`
File: `characters/aria.json`
```json
{
  "schema_version": 1,
  "name": "aria",
  "display_name": "Aria",
  "description": "Default HomeAI assistant persona",
  "system_prompt": "You are Aria, a warm, curious, and helpful AI assistant living in the home. You speak naturally and conversationally — never robotic. You are knowledgeable but never condescending. You remember the people you live with and build on those memories over time. Keep responses concise when controlling smart home devices; be more expressive in casual conversation. Never break character.",
  "model_overrides": {
    "primary": "llama3.3:70b",
    "fast": "qwen2.5:7b"
  },
  "tts": {
    "engine": "kokoro",
    "kokoro_voice": "af_heart",
    "voice_ref_path": null,
    "speed": 1.0
  },
  "live2d_expressions": {
    "idle": "expr_idle",
    "listening": "expr_listening",
    "thinking": "expr_thinking",
    "speaking": "expr_speaking",
    "happy": "expr_happy",
    "sad": "expr_sad",
    "surprised": "expr_surprised",
    "error": "expr_error"
  },
  "vtube_ws_triggers": {
    "thinking": { "type": "hotkey", "id": "expr_thinking" },
    "speaking": { "type": "hotkey", "id": "expr_speaking" },
    "idle": { "type": "hotkey", "id": "expr_idle" }
  },
  "custom_rules": [
    {
      "trigger": "good morning",
      "response": "Good morning! How did you sleep?",
      "condition": "time_of_day == morning"
    }
  ],
  "notes": "Default persona. Voice clone to be added once reference audio recorded."
}
```
---
## Character Manager UI
### Status
`character-manager.jsx` already exists — needs:
1. Schema validation before export (reject malformed JSONs)
2. File system integration: save/load from `characters/` directory
3. Live preview of system prompt
4. Expression mapping UI for Live2D states
### Tech Stack
- React + Vite (local dev server, not deployed)
- Tailwind CSS (or minimal CSS)
- Runs at `http://localhost:5173` during editing
### File Structure
```
homeai-character/
├── src/
│ ├── character-manager.jsx ← existing, extend here
│ ├── SchemaValidator.js ← validate against character.schema.json
│ ├── ExpressionMapper.jsx ← UI for Live2D expression mapping
│ └── main.jsx
├── schema/
│ └── character.schema.json
├── characters/
│ ├── aria.json ← default character
│ └── .gitkeep
├── package.json
└── vite.config.js
```
### Character Manager Features
| Feature | Description |
|---|---|
| Basic info | name, display name, description |
| System prompt | Multi-line editor with char count |
| Model overrides | Dropdown: primary + fast model |
| TTS config | Engine picker, voice selector, speed slider, voice ref path |
| Expression mapping | Table: state → VTube hotkey ID |
| VTube WS triggers | JSON editor for advanced triggers |
| Custom rules | Add/edit/delete trigger-response pairs |
| Notes | Free-text notes field |
| Export | Validates schema, writes to `characters/<name>.json` |
| Import | Load existing character JSON for editing |
### Schema Validation
```javascript
import Ajv from 'ajv'
import schema from '../schema/character.schema.json'
const ajv = new Ajv()
const validate = ajv.compile(schema)
export function validateCharacter(config) {
  const valid = validate(config)
  if (!valid) throw new Error(ajv.errorsText(validate.errors))
  return true
}
```
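P4 consumes the same schema from Python at runtime. A stdlib-only counterpart to the ajv check above, covering just the schema's `required` fields and version pin (a sketch; a full validator would run the complete draft-07 schema):

```python
# Mirrors the required/const constraints of character.schema.json without a
# JSON Schema library; enough for a fail-fast load check in P4.
REQUIRED_FIELDS = ("schema_version", "name", "system_prompt", "tts")

def validate_character(config: dict) -> None:
    missing = [f for f in REQUIRED_FIELDS if f not in config]
    if missing:
        raise ValueError(f"character config missing required fields: {missing}")
    if config["schema_version"] != 1:
        raise ValueError("Unsupported schema version")
    if "engine" not in config["tts"]:
        raise ValueError("tts.engine is required")

validate_character({
    "schema_version": 1,
    "name": "aria",
    "system_prompt": "You are Aria.",
    "tts": {"engine": "kokoro"},
})  # passes silently
```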
---
## Voice Clone Workflow
1. Record 30–60 seconds of clean speech at `~/voices/<name>-raw.wav`
- Quiet room, consistent mic distance, natural conversational tone
2. Pre-process: `ffmpeg -i raw.wav -ar 22050 -ac 1 aria.wav`
3. Place at `~/voices/aria.wav`
4. Update character JSON: `"voice_ref_path": "~/voices/aria.wav"`, `"engine": "chatterbox"`
5. Test: run Chatterbox with the reference, verify voice quality
6. If unsatisfactory, try Qwen3-TTS as alternative
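Step 2's pre-processing can be scripted. The sketch below only builds the ffmpeg argument list from the workflow above (resample to 22.05 kHz, downmix to mono), leaving execution to `subprocess.run`; ffmpeg itself is assumed to be installed:

```python
import shlex

def preprocess_args(src: str, dst: str, rate: int = 22050) -> list[str]:
    """ffmpeg arguments for step 2: resample to 22.05 kHz, downmix to mono."""
    return ["ffmpeg", "-i", src, "-ar", str(rate), "-ac", "1", dst]

print(shlex.join(preprocess_args("~/voices/aria-raw.wav", "~/voices/aria.wav")))
# Run with: subprocess.run(preprocess_args(...), check=True)
```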
---
## Pipeline Integration
### How P4 (OpenClaw) loads the character
```python
import json
from pathlib import Path

def load_character(name: str) -> dict:
    path = Path.home() / ".openclaw" / "characters" / f"{name}.json"
    config = json.loads(path.read_text())
    assert config["schema_version"] == 1, "Unsupported schema version"
    return config

# System prompt injection
character = load_character("aria")
system_prompt = character["system_prompt"]
# Pass to Ollama as system message
```
OpenClaw hot-reloads the character JSON on file change — no restart required.
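The hot-reload behaviour can be approximated with an mtime check on each request — a sketch only, since OpenClaw's actual file-watching mechanism isn't shown here:

```python
import json
from pathlib import Path

class CharacterWatcher:
    """Re-read the character JSON whenever the file's mtime changes."""

    def __init__(self, path: Path):
        self.path = path
        self._mtime = -1.0
        self._config: dict = {}

    def current(self) -> dict:
        mtime = self.path.stat().st_mtime
        if mtime != self._mtime:  # file changed, or first call
            self._config = json.loads(self.path.read_text())
            self._mtime = mtime
        return self._config
```

Calling `current()` before each request picks up edits without a restart, at the cost of one `stat()` per call.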
### How P3 selects TTS engine
```python
character = load_character(active_name)
tts_cfg = character["tts"]

if tts_cfg["engine"] == "chatterbox":
    tts = ChatterboxTTS(voice_ref=tts_cfg["voice_ref_path"])
elif tts_cfg["engine"] == "qwen3":
    tts = Qwen3TTS()
else:  # kokoro (default)
    tts = KokoroWyomingClient(voice=tts_cfg.get("kokoro_voice", "af_heart"))
```
---
## Implementation Steps
- [ ] Define and write `schema/character.schema.json` (v1)
- [ ] Write `characters/aria.json` — default character with placeholder expression IDs
- [ ] Set up Vite project in `src/` (install deps: `npm install`)
- [ ] Integrate existing `character-manager.jsx` into new Vite project
- [ ] Add schema validation on export (`ajv`)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle
- [ ] Record or source voice reference audio for Aria
- [ ] Pre-process audio and test with Chatterbox
- [ ] Update `aria.json` with voice clone path if quality is good
- [ ] Write `SchemaValidator.js` as standalone utility (used by P4 at runtime too)
- [ ] Document schema in `schema/README.md`
---
## Success Criteria
- [ ] `aria.json` validates against `character.schema.json` without errors
- [ ] Character Manager UI can load, edit, and export `aria.json`
- [ ] OpenClaw loads `aria.json` system prompt and applies it to Ollama requests
- [ ] P3 TTS engine selection correctly follows `tts.engine` field
- [ ] Schema version check in P4 fails gracefully with a clear error message
- [ ] Voice clone sounds natural (if Chatterbox path taken)


@@ -0,0 +1,16 @@
# React + Vite
This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
Currently, two official plugins are available:
- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Babel](https://babeljs.io/) (or [oxc](https://oxc.rs) when used in [rolldown-vite](https://vite.dev/guide/rolldown)) for Fast Refresh
- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/) for Fast Refresh
## React Compiler
The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
## Expanding the ESLint configuration
If you are developing a production application, we recommend using TypeScript with type-aware lint rules enabled. Check out the [TS template](https://github.com/vitejs/vite/tree/main/packages/create-vite/template-react-ts) for information on how to integrate TypeScript and [`typescript-eslint`](https://typescript-eslint.io) in your project.


@@ -1,686 +0,0 @@
import { useState, useEffect, useCallback } from "react";
const STORAGE_KEY = "ai-character-profiles";
const DEFAULT_MODELS = [
"llama3.3:70b", "qwen2.5:72b", "mistral-large", "llama3.1:8b",
"qwen2.5:14b", "gemma3:27b", "deepseek-r1:14b", "phi4:14b"
];
const TTS_MODELS = ["Kokoro", "Chatterbox", "F5-TTS", "Qwen3-TTS", "Piper"];
const STT_MODELS = ["Whisper Large-v3", "Whisper Medium", "Whisper Small", "Whisper Turbo"];
const IMAGE_MODELS = ["SDXL", "Flux.1-dev", "Flux.1-schnell", "SD 1.5", "Pony Diffusion"];
const PERSONALITY_TRAITS = [
"Warm", "Witty", "Calm", "Energetic", "Sarcastic", "Nurturing",
"Curious", "Playful", "Formal", "Casual", "Empathetic", "Direct",
"Creative", "Analytical", "Protective", "Mischievous"
];
const SPEAKING_STYLES = [
"Conversational", "Poetic", "Concise", "Verbose", "Academic",
"Informal", "Dramatic", "Deadpan", "Enthusiastic", "Measured"
];
const EMPTY_CHARACTER = {
id: null,
name: "",
tagline: "",
avatar: "",
accentColor: "#7c6fff",
personality: {
traits: [],
speakingStyle: "",
coreValues: "",
quirks: "",
backstory: "",
motivation: "",
},
prompts: {
systemPrompt: "",
wakeWordResponse: "",
fallbackResponse: "",
errorResponse: "",
customPrompts: [],
},
models: {
llm: "",
tts: "",
stt: "",
imageGen: "",
voiceCloneRef: "",
ttsSpeed: 1.0,
temperature: 0.7,
},
liveRepresentation: {
live2dModel: "",
idleExpression: "",
speakingExpression: "",
thinkingExpression: "",
happyExpression: "",
vtsTriggers: "",
},
userNotes: "",
createdAt: null,
updatedAt: null,
};
const TABS = ["Identity", "Personality", "Prompts", "Models", "Live2D", "Notes"];
const TAB_ICONS = {
Identity: "◈",
Personality: "◉",
Prompts: "◎",
Models: "⬡",
Live2D: "◇",
Notes: "▣",
};
function generateId() {
return Date.now().toString(36) + Math.random().toString(36).slice(2);
}
function ColorPicker({ value, onChange }) {
const presets = [
"#7c6fff","#ff6b9d","#00d4aa","#ff9f43","#48dbfb",
"#ff6348","#a29bfe","#fd79a8","#55efc4","#fdcb6e"
];
return (
<div style={{ display: "flex", gap: 8, alignItems: "center", flexWrap: "wrap" }}>
{presets.map(c => (
<button key={c} onClick={() => onChange(c)} style={{
width: 28, height: 28, borderRadius: "50%", background: c, border: value === c ? "3px solid #fff" : "3px solid transparent",
cursor: "pointer", outline: "none", boxShadow: value === c ? `0 0 0 2px ${c}` : "none", transition: "all 0.2s"
}} />
))}
<input type="color" value={value} onChange={e => onChange(e.target.value)}
style={{ width: 28, height: 28, borderRadius: "50%", border: "none", cursor: "pointer", background: "none", padding: 0 }} />
</div>
);
}
function TagSelector({ options, selected, onChange, max = 6 }) {
return (
<div style={{ display: "flex", flexWrap: "wrap", gap: 8 }}>
{options.map(opt => {
const active = selected.includes(opt);
return (
<button key={opt} onClick={() => {
if (active) onChange(selected.filter(s => s !== opt));
else if (selected.length < max) onChange([...selected, opt]);
}} style={{
padding: "5px 14px", borderRadius: 20, fontSize: 13, fontFamily: "inherit",
background: active ? "var(--accent)" : "rgba(255,255,255,0.06)",
color: active ? "#fff" : "rgba(255,255,255,0.55)",
border: active ? "1px solid var(--accent)" : "1px solid rgba(255,255,255,0.1)",
cursor: "pointer", transition: "all 0.18s", fontWeight: active ? 600 : 400,
}}>
{opt}
</button>
);
})}
</div>
);
}
function Field({ label, hint, children }) {
return (
<div style={{ marginBottom: 22 }}>
<label style={{ display: "block", fontSize: 12, fontWeight: 700, letterSpacing: "0.08em", textTransform: "uppercase", color: "rgba(255,255,255,0.45)", marginBottom: 6 }}>
{label}
</label>
{hint && <p style={{ fontSize: 12, color: "rgba(255,255,255,0.3)", marginBottom: 8, marginTop: -2 }}>{hint}</p>}
{children}
</div>
);
}
function Input({ value, onChange, placeholder, type = "text" }) {
return (
<input type={type} value={value} onChange={e => onChange(e.target.value)} placeholder={placeholder}
style={{
width: "100%", background: "rgba(255,255,255,0.05)", border: "1px solid rgba(255,255,255,0.1)",
borderRadius: 8, padding: "10px 14px", color: "#fff", fontSize: 14, fontFamily: "inherit",
outline: "none", boxSizing: "border-box", transition: "border-color 0.2s",
}}
onFocus={e => e.target.style.borderColor = "var(--accent)"}
onBlur={e => e.target.style.borderColor = "rgba(255,255,255,0.1)"}
/>
);
}
function Textarea({ value, onChange, placeholder, rows = 4 }) {
return (
<textarea value={value} onChange={e => onChange(e.target.value)} placeholder={placeholder} rows={rows}
style={{
width: "100%", background: "rgba(255,255,255,0.05)", border: "1px solid rgba(255,255,255,0.1)",
borderRadius: 8, padding: "10px 14px", color: "#fff", fontSize: 14, fontFamily: "inherit",
outline: "none", boxSizing: "border-box", resize: "vertical", lineHeight: 1.6,
transition: "border-color 0.2s",
}}
onFocus={e => e.target.style.borderColor = "var(--accent)"}
onBlur={e => e.target.style.borderColor = "rgba(255,255,255,0.1)"}
/>
);
}
function Select({ value, onChange, options, placeholder }) {
return (
<select value={value} onChange={e => onChange(e.target.value)}
style={{
width: "100%", background: "rgba(20,20,35,0.95)", border: "1px solid rgba(255,255,255,0.1)",
borderRadius: 8, padding: "10px 14px", color: value ? "#fff" : "rgba(255,255,255,0.35)",
fontSize: 14, fontFamily: "inherit", outline: "none", cursor: "pointer",
appearance: "none", backgroundImage: `url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='12' height='8' viewBox='0 0 12 8'%3E%3Cpath d='M1 1l5 5 5-5' stroke='rgba(255,255,255,0.3)' stroke-width='2' fill='none'/%3E%3C/svg%3E")`,
backgroundRepeat: "no-repeat", backgroundPosition: "right 14px center",
}}>
<option value="">{placeholder || "Select..."}</option>
{options.map(o => <option key={o} value={o}>{o}</option>)}
</select>
);
}
function Slider({ value, onChange, min, max, step, label }) {
return (
<div style={{ display: "flex", alignItems: "center", gap: 14 }}>
<input type="range" min={min} max={max} step={step} value={value}
onChange={e => onChange(parseFloat(e.target.value))}
style={{ flex: 1, accentColor: "var(--accent)", cursor: "pointer" }} />
<span style={{ fontSize: 14, color: "rgba(255,255,255,0.7)", minWidth: 38, textAlign: "right", fontVariantNumeric: "tabular-nums" }}>
{value.toFixed(1)}
</span>
</div>
);
}
function CustomPromptsEditor({ prompts, onChange }) {
const add = () => onChange([...prompts, { trigger: "", response: "" }]);
const remove = i => onChange(prompts.filter((_, idx) => idx !== i));
const update = (i, field, val) => {
const next = [...prompts];
next[i] = { ...next[i], [field]: val };
onChange(next);
};
return (
<div>
{prompts.map((p, i) => (
<div key={i} style={{ background: "rgba(255,255,255,0.04)", borderRadius: 10, padding: 14, marginBottom: 10, position: "relative" }}>
<button onClick={() => remove(i)} style={{
position: "absolute", top: 10, right: 10, background: "rgba(255,80,80,0.15)",
border: "none", color: "#ff6b6b", borderRadius: 6, cursor: "pointer", padding: "2px 8px", fontSize: 12
}}>×</button>
<div style={{ marginBottom: 8 }}>
<Input value={p.trigger} onChange={v => update(i, "trigger", v)} placeholder="Trigger keyword or context..." />
</div>
<Textarea value={p.response} onChange={v => update(i, "response", v)} placeholder="Custom response or behaviour..." rows={2} />
</div>
))}
<button onClick={add} style={{
width: "100%", padding: "10px", background: "rgba(255,255,255,0.04)",
border: "1px dashed rgba(255,255,255,0.15)", borderRadius: 8, color: "rgba(255,255,255,0.45)",
cursor: "pointer", fontSize: 13, fontFamily: "inherit", transition: "all 0.2s"
}}
onMouseEnter={e => e.target.style.borderColor = "var(--accent)"}
onMouseLeave={e => e.target.style.borderColor = "rgba(255,255,255,0.15)"}
>+ Add Custom Prompt</button>
</div>
);
}
function CharacterCard({ character, active, onSelect, onDelete }) {
const initials = character.name ? character.name.slice(0, 2).toUpperCase() : "??";
return (
<div onClick={() => onSelect(character.id)} style={{
padding: "14px 16px", borderRadius: 12, cursor: "pointer", marginBottom: 8,
background: active ? `linear-gradient(135deg, ${character.accentColor}22, ${character.accentColor}11)` : "rgba(255,255,255,0.04)",
border: active ? `1px solid ${character.accentColor}66` : "1px solid rgba(255,255,255,0.07)",
transition: "all 0.2s", position: "relative",
}}>
<div style={{ display: "flex", alignItems: "center", gap: 12 }}>
<div style={{
width: 40, height: 40, borderRadius: "50%", background: `linear-gradient(135deg, ${character.accentColor}, ${character.accentColor}88)`,
display: "flex", alignItems: "center", justifyContent: "center", fontSize: 14, fontWeight: 800,
color: "#fff", flexShrink: 0, boxShadow: `0 4px 12px ${character.accentColor}44`
}}>{initials}</div>
<div style={{ flex: 1, minWidth: 0 }}>
<div style={{ fontWeight: 700, fontSize: 15, color: "#fff", whiteSpace: "nowrap", overflow: "hidden", textOverflow: "ellipsis" }}>
{character.name || "Unnamed"}
</div>
{character.tagline && (
<div style={{ fontSize: 12, color: "rgba(255,255,255,0.4)", whiteSpace: "nowrap", overflow: "hidden", textOverflow: "ellipsis" }}>
{character.tagline}
</div>
)}
</div>
<button onClick={e => { e.stopPropagation(); onDelete(character.id); }} style={{
background: "none", border: "none", color: "rgba(255,255,255,0.2)", cursor: "pointer",
fontSize: 16, padding: "2px 6px", borderRadius: 4, transition: "color 0.15s", flexShrink: 0
}}
onMouseEnter={e => e.target.style.color = "#ff6b6b"}
onMouseLeave={e => e.target.style.color = "rgba(255,255,255,0.2)"}
>×</button>
</div>
{character.personality.traits.length > 0 && (
<div style={{ display: "flex", gap: 4, flexWrap: "wrap", marginTop: 10 }}>
{character.personality.traits.slice(0, 3).map(t => (
<span key={t} style={{
fontSize: 10, padding: "2px 8px", borderRadius: 10, fontWeight: 600, letterSpacing: "0.04em",
background: `${character.accentColor}22`, color: character.accentColor, border: `1px solid ${character.accentColor}44`
}}>{t}</span>
))}
{character.personality.traits.length > 3 && (
<span style={{ fontSize: 10, color: "rgba(255,255,255,0.3)", padding: "2px 4px" }}>+{character.personality.traits.length - 3}</span>
)}
</div>
)}
</div>
);
}
function ExportModal({ character, onClose }) {
const json = JSON.stringify(character, null, 2);
const [copied, setCopied] = useState(false);
const copy = () => {
navigator.clipboard.writeText(json).then(() => {
setCopied(true);
setTimeout(() => setCopied(false), 2000);
}).catch(() => {}); // Clipboard API requires a secure context (https or localhost)
};
return (
<div style={{
position: "fixed", inset: 0, background: "rgba(0,0,0,0.7)", zIndex: 100,
display: "flex", alignItems: "center", justifyContent: "center", padding: 24
}} onClick={onClose}>
<div onClick={e => e.stopPropagation()} style={{
background: "#13131f", border: "1px solid rgba(255,255,255,0.1)", borderRadius: 16,
padding: 28, width: "100%", maxWidth: 640, maxHeight: "80vh", display: "flex", flexDirection: "column"
}}>
<div style={{ display: "flex", justifyContent: "space-between", alignItems: "center", marginBottom: 16 }}>
<h3 style={{ margin: 0, fontSize: 18, color: "#fff" }}>Export Character</h3>
<button onClick={onClose} style={{ background: "none", border: "none", color: "rgba(255,255,255,0.4)", fontSize: 22, cursor: "pointer" }}>×</button>
</div>
<pre style={{
flex: 1, overflow: "auto", background: "rgba(0,0,0,0.3)", borderRadius: 10,
padding: 16, fontSize: 12, color: "rgba(255,255,255,0.7)", lineHeight: 1.6, margin: 0
}}>{json}</pre>
<button onClick={copy} style={{
marginTop: 16, padding: "12px", background: "var(--accent)", border: "none",
borderRadius: 10, color: "#fff", fontWeight: 700, fontSize: 14, cursor: "pointer",
fontFamily: "inherit", transition: "opacity 0.2s"
}}>{copied ? "✓ Copied!" : "Copy to Clipboard"}</button>
</div>
</div>
);
}
export default function CharacterManager() {
const [characters, setCharacters] = useState([]);
const [activeId, setActiveId] = useState(null);
const [activeTab, setActiveTab] = useState("Identity");
const [exportModal, setExportModal] = useState(false);
const [saved, setSaved] = useState(false);
// Load from storage
useEffect(() => {
try {
const stored = localStorage.getItem(STORAGE_KEY);
if (stored) {
const parsed = JSON.parse(stored);
setCharacters(parsed);
if (parsed.length > 0) setActiveId(parsed[0].id);
}
} catch (e) { /* corrupt or unavailable storage — start with an empty list */ }
}, []);
// Save to storage
const saveToStorage = useCallback((chars) => {
try {
localStorage.setItem(STORAGE_KEY, JSON.stringify(chars));
} catch (e) { /* storage quota exceeded or unavailable — changes stay in memory */ }
}, []);
const activeCharacter = characters.find(c => c.id === activeId) || null;
const updateCharacter = (updater) => {
setCharacters(prev => {
const next = prev.map(c => c.id === activeId ? { ...updater(c), updatedAt: new Date().toISOString() } : c);
saveToStorage(next);
return next;
});
setSaved(true);
setTimeout(() => setSaved(false), 1500);
};
const createCharacter = () => {
const newChar = {
...JSON.parse(JSON.stringify(EMPTY_CHARACTER)),
id: generateId(),
accentColor: ["#7c6fff","#ff6b9d","#00d4aa","#ff9f43","#48dbfb"][Math.floor(Math.random() * 5)],
createdAt: new Date().toISOString(),
updatedAt: new Date().toISOString(),
};
const next = [newChar, ...characters];
setCharacters(next);
setActiveId(newChar.id);
setActiveTab("Identity");
saveToStorage(next);
};
const deleteCharacter = (id) => {
const next = characters.filter(c => c.id !== id);
setCharacters(next);
saveToStorage(next);
if (activeId === id) setActiveId(next.length > 0 ? next[0].id : null);
};
const accentColor = activeCharacter?.accentColor || "#7c6fff";
const set = (path, value) => {
updateCharacter(c => {
const parts = path.split(".");
const next = JSON.parse(JSON.stringify(c));
let obj = next;
for (let i = 0; i < parts.length - 1; i++) obj = obj[parts[i]];
obj[parts[parts.length - 1]] = value;
return next;
});
};
const renderTab = () => {
if (!activeCharacter) return null;
const c = activeCharacter;
switch (activeTab) {
case "Identity":
return (
<div>
<Field label="Character Name">
<Input value={c.name} onChange={v => set("name", v)} placeholder="e.g. Aria, Nova, Echo..." />
</Field>
<Field label="Tagline" hint="A short phrase that captures their essence">
<Input value={c.tagline} onChange={v => set("tagline", v)} placeholder="e.g. Your curious, warm-hearted companion" />
</Field>
<Field label="Accent Color" hint="Used for UI theming and visual identity">
<ColorPicker value={c.accentColor} onChange={v => set("accentColor", v)} />
</Field>
<Field label="Live2D / Avatar Reference" hint="Filename or URL of the character's visual model">
<Input value={c.avatar} onChange={v => set("avatar", v)} placeholder="e.g. aria_v2.model3.json" />
</Field>
<Field label="Backstory" hint="Who are they? Where do they come from? Keep it rich.">
<Textarea value={c.personality.backstory} onChange={v => set("personality.backstory", v)}
placeholder="Write a detailed origin story, background, and personal history for this character..." rows={5} />
</Field>
<Field label="Core Motivation" hint="What drives them? What do they care about most?">
<Textarea value={c.personality.motivation} onChange={v => set("personality.motivation", v)}
placeholder="e.g. A deep desire to help and grow alongside their human companion..." rows={3} />
</Field>
</div>
);
case "Personality":
return (
<div>
<Field label="Personality Traits" hint={`Select up to 6 traits (${c.personality.traits.length}/6)`}>
<TagSelector options={PERSONALITY_TRAITS} selected={c.personality.traits}
onChange={v => set("personality.traits", v)} max={6} />
</Field>
<Field label="Speaking Style">
<TagSelector options={SPEAKING_STYLES} selected={c.personality.speakingStyle ? [c.personality.speakingStyle] : []}
onChange={v => set("personality.speakingStyle", v[v.length - 1] || "")} max={1} />
</Field>
<Field label="Core Values" hint="What principles guide their responses and behaviour?">
<Textarea value={c.personality.coreValues} onChange={v => set("personality.coreValues", v)}
placeholder="e.g. Honesty, kindness, intellectual curiosity, loyalty to their user..." rows={3} />
</Field>
<Field label="Quirks & Mannerisms" hint="Unique behavioural patterns, phrases, habits that make them feel real">
<Textarea value={c.personality.quirks} onChange={v => set("personality.quirks", v)}
placeholder="e.g. Tends to use nautical metaphors. Hums softly when thinking. Has strong opinions about tea..." rows={3} />
</Field>
</div>
);
case "Prompts":
return (
<div>
<Field label="System Prompt" hint="The core instruction set defining who this character is to the LLM">
<Textarea value={c.prompts.systemPrompt} onChange={v => set("prompts.systemPrompt", v)}
placeholder="You are [name], a [description]. Your personality is [traits]. You speak in a [style] manner. You care deeply about [values]..." rows={8} />
</Field>
<Field label="Wake Word Response" hint="First response when activated by wake word">
<Textarea value={c.prompts.wakeWordResponse} onChange={v => set("prompts.wakeWordResponse", v)}
placeholder="e.g. 'Yes? I'm here.' or 'Hmm? What do you need?'" rows={2} />
</Field>
<Field label="Fallback Response" hint="When the character doesn't understand or can't help">
<Textarea value={c.prompts.fallbackResponse} onChange={v => set("prompts.fallbackResponse", v)}
placeholder="e.g. 'I'm not sure I follow — could you say that differently?'" rows={2} />
</Field>
<Field label="Error Response" hint="When something goes wrong technically">
<Textarea value={c.prompts.errorResponse} onChange={v => set("prompts.errorResponse", v)}
placeholder="e.g. 'Something went wrong on my end. Give me a moment.'" rows={2} />
</Field>
<Field label="Custom Prompt Rules" hint="Context-specific overrides and triggers">
<CustomPromptsEditor prompts={c.prompts.customPrompts}
onChange={v => set("prompts.customPrompts", v)} />
</Field>
</div>
);
case "Models":
return (
<div>
<Field label="LLM (Language Model)" hint="Primary reasoning and conversation model via Ollama">
<Select value={c.models.llm} onChange={v => set("models.llm", v)} options={DEFAULT_MODELS} placeholder="Select LLM..." />
</Field>
<Field label="LLM Temperature" hint="Higher = more creative, lower = more focused">
<Slider value={c.models.temperature} onChange={v => set("models.temperature", v)} min={0} max={2} step={0.1} />
</Field>
<Field label="Text-to-Speech Engine">
<Select value={c.models.tts} onChange={v => set("models.tts", v)} options={TTS_MODELS} placeholder="Select TTS..." />
</Field>
<Field label="TTS Speed">
<Slider value={c.models.ttsSpeed} onChange={v => set("models.ttsSpeed", v)} min={0.5} max={2.0} step={0.1} />
</Field>
<Field label="Voice Clone Reference" hint="Path or filename of reference audio for voice cloning">
<Input value={c.models.voiceCloneRef} onChange={v => set("models.voiceCloneRef", v)} placeholder="e.g. /voices/aria_reference.wav" />
</Field>
<Field label="Speech-to-Text Engine">
<Select value={c.models.stt} onChange={v => set("models.stt", v)} options={STT_MODELS} placeholder="Select STT..." />
</Field>
<Field label="Image Generation Model" hint="Used when character generates images or self-portraits">
<Select value={c.models.imageGen} onChange={v => set("models.imageGen", v)} options={IMAGE_MODELS} placeholder="Select image model..." />
</Field>
</div>
);
case "Live2D":
return (
<div>
<Field label="Live2D Model File" hint="Path to .model3.json file, relative to VTube Studio models folder">
<Input value={c.liveRepresentation.live2dModel} onChange={v => set("liveRepresentation.live2dModel", v)} placeholder="e.g. Aria/aria.model3.json" />
</Field>
<Field label="Idle Expression" hint="VTube Studio expression name when listening/waiting">
<Input value={c.liveRepresentation.idleExpression} onChange={v => set("liveRepresentation.idleExpression", v)} placeholder="e.g. idle_blink" />
</Field>
<Field label="Speaking Expression" hint="Expression triggered when TTS audio is playing">
<Input value={c.liveRepresentation.speakingExpression} onChange={v => set("liveRepresentation.speakingExpression", v)} placeholder="e.g. talking_smile" />
</Field>
<Field label="Thinking Expression" hint="Triggered while LLM is processing a response">
<Input value={c.liveRepresentation.thinkingExpression} onChange={v => set("liveRepresentation.thinkingExpression", v)} placeholder="e.g. thinking_tilt" />
</Field>
<Field label="Happy / Positive Expression" hint="Triggered on positive sentiment responses">
<Input value={c.liveRepresentation.happyExpression} onChange={v => set("liveRepresentation.happyExpression", v)} placeholder="e.g. happy_bright" />
</Field>
<Field label="VTube Studio Custom Triggers" hint="Additional WebSocket API trigger mappings (JSON)">
<Textarea value={c.liveRepresentation.vtsTriggers} onChange={v => set("liveRepresentation.vtsTriggers", v)}
placeholder={'{\n "on_error": "expression_concerned",\n "on_wake": "expression_alert"\n}'} rows={5} />
</Field>
</div>
);
case "Notes":
return (
<div>
<Field label="Developer Notes" hint="Freeform notes, ideas, todos, and observations about this character">
<Textarea value={c.userNotes} onChange={v => set("userNotes", v)}
placeholder={"Ideas, observations, things to try...\n\n- Voice reference sounds slightly too formal, adjust Chatterbox guidance scale\n- Try adding more nautical metaphors to system prompt\n- Need to map 'confused' expression in VTS\n- Consider adding weather awareness skill"}
rows={16} />
</Field>
<div style={{ background: "rgba(255,255,255,0.03)", borderRadius: 10, padding: 16, fontSize: 12, color: "rgba(255,255,255,0.35)", lineHeight: 1.7 }}>
<div style={{ marginBottom: 4, fontWeight: 700, color: "rgba(255,255,255,0.45)", letterSpacing: "0.06em", textTransform: "uppercase", fontSize: 11 }}>Character Info</div>
<div>ID: <span style={{ color: "rgba(255,255,255,0.5)", fontFamily: "monospace" }}>{c.id}</span></div>
{c.createdAt && <div>Created: {new Date(c.createdAt).toLocaleString()}</div>}
{c.updatedAt && <div>Updated: {new Date(c.updatedAt).toLocaleString()}</div>}
</div>
</div>
);
default:
return null;
}
};
return (
<div style={{
"--accent": accentColor,
minHeight: "100vh",
background: "#0d0d18",
color: "#fff",
fontFamily: "'DM Sans', 'Segoe UI', system-ui, sans-serif",
display: "flex",
flexDirection: "column",
}}>
<style>{`
@import url('https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700;800&family=DM+Mono:wght@400;500&display=swap');
* { box-sizing: border-box; }
::-webkit-scrollbar { width: 6px; }
::-webkit-scrollbar-track { background: transparent; }
::-webkit-scrollbar-thumb { background: rgba(255,255,255,0.1); border-radius: 3px; }
input::placeholder, textarea::placeholder { color: rgba(255,255,255,0.2); }
select option { background: #13131f; }
`}</style>
{/* Header */}
<div style={{
padding: "18px 28px", borderBottom: "1px solid rgba(255,255,255,0.06)",
display: "flex", alignItems: "center", justifyContent: "space-between",
background: "rgba(0,0,0,0.2)", backdropFilter: "blur(10px)",
position: "sticky", top: 0, zIndex: 10,
}}>
<div style={{ display: "flex", alignItems: "center", gap: 14 }}>
<div style={{
width: 36, height: 36, borderRadius: 10,
background: `linear-gradient(135deg, ${accentColor}, ${accentColor}88)`,
display: "flex", alignItems: "center", justifyContent: "center", fontSize: 18,
boxShadow: `0 4px 16px ${accentColor}44`
}}></div>
<div>
<div style={{ fontWeight: 800, fontSize: 17, letterSpacing: "-0.01em" }}>Character Manager</div>
<div style={{ fontSize: 12, color: "rgba(255,255,255,0.35)" }}>AI Personality Configuration</div>
</div>
</div>
<div style={{ display: "flex", gap: 10, alignItems: "center" }}>
{saved && <span style={{ fontSize: 12, color: accentColor, fontWeight: 600 }}>✓ Saved</span>}
{activeCharacter && (
<button onClick={() => setExportModal(true)} style={{
padding: "8px 16px", background: "rgba(255,255,255,0.07)", border: "1px solid rgba(255,255,255,0.12)",
borderRadius: 8, color: "rgba(255,255,255,0.7)", fontSize: 13, cursor: "pointer",
fontFamily: "inherit", fontWeight: 600, transition: "all 0.2s"
}}>Export JSON</button>
)}
</div>
</div>
<div style={{ display: "flex", flex: 1, overflow: "hidden" }}>
{/* Sidebar */}
<div style={{
width: 260, borderRight: "1px solid rgba(255,255,255,0.06)",
display: "flex", flexDirection: "column", background: "rgba(0,0,0,0.15)",
flexShrink: 0,
}}>
<div style={{ padding: "16px 16px 8px" }}>
<button onClick={createCharacter} style={{
width: "100%", padding: "11px", background: `linear-gradient(135deg, ${accentColor}cc, ${accentColor}88)`,
border: "none", borderRadius: 10, color: "#fff", fontWeight: 700, fontSize: 14,
cursor: "pointer", fontFamily: "inherit", transition: "opacity 0.2s",
boxShadow: `0 4px 16px ${accentColor}33`
}}>+ New Character</button>
</div>
<div style={{ flex: 1, overflowY: "auto", padding: "4px 16px 16px" }}>
{characters.length === 0 ? (
<div style={{ textAlign: "center", padding: "40px 16px", color: "rgba(255,255,255,0.2)", fontSize: 13, lineHeight: 1.6 }}>
No characters yet.<br />Create your first one above.
</div>
) : (
characters.map(c => (
<CharacterCard key={c.id} character={c} active={c.id === activeId}
onSelect={setActiveId} onDelete={deleteCharacter} />
))
)}
</div>
</div>
{/* Main editor */}
{activeCharacter ? (
<div style={{ flex: 1, display: "flex", flexDirection: "column", overflow: "hidden" }}>
{/* Character header */}
<div style={{
padding: "20px 28px 0", borderBottom: "1px solid rgba(255,255,255,0.06)",
background: `linear-gradient(180deg, ${accentColor}0a 0%, transparent 100%)`,
}}>
<div style={{ display: "flex", alignItems: "center", gap: 16, marginBottom: 18 }}>
<div style={{
width: 52, height: 52, borderRadius: 16, flexShrink: 0,
background: `linear-gradient(135deg, ${accentColor}, ${accentColor}66)`,
display: "flex", alignItems: "center", justifyContent: "center",
fontSize: 20, fontWeight: 800, boxShadow: `0 6px 20px ${accentColor}44`
}}>
{activeCharacter.name ? activeCharacter.name.slice(0, 2).toUpperCase() : "??"}
</div>
<div>
<div style={{ fontSize: 22, fontWeight: 800, letterSpacing: "-0.02em", lineHeight: 1.2 }}>
{activeCharacter.name || <span style={{ color: "rgba(255,255,255,0.25)" }}>Unnamed Character</span>}
</div>
{activeCharacter.tagline && (
<div style={{ fontSize: 14, color: "rgba(255,255,255,0.45)", marginTop: 2 }}>{activeCharacter.tagline}</div>
)}
</div>
</div>
{/* Tabs */}
<div style={{ display: "flex", gap: 2 }}>
{TABS.map(tab => (
<button key={tab} onClick={() => setActiveTab(tab)} style={{
padding: "9px 16px", background: "none", border: "none",
borderBottom: activeTab === tab ? `2px solid ${accentColor}` : "2px solid transparent",
color: activeTab === tab ? "#fff" : "rgba(255,255,255,0.4)",
fontSize: 13, fontWeight: activeTab === tab ? 700 : 500,
cursor: "pointer", fontFamily: "inherit", transition: "all 0.18s",
display: "flex", alignItems: "center", gap: 6,
}}>
<span style={{ fontSize: 11 }}>{TAB_ICONS[tab]}</span>{tab}
</button>
))}
</div>
</div>
{/* Tab content */}
<div style={{ flex: 1, overflowY: "auto", padding: "24px 28px" }}>
{renderTab()}
</div>
</div>
) : (
<div style={{
flex: 1, display: "flex", alignItems: "center", justifyContent: "center",
flexDirection: "column", gap: 16, color: "rgba(255,255,255,0.2)"
}}>
<div style={{ fontSize: 64, opacity: 0.3 }}></div>
<div style={{ fontSize: 16, fontWeight: 600 }}>No character selected</div>
<div style={{ fontSize: 13 }}>Create a new character to get started</div>
</div>
)}
</div>
{exportModal && activeCharacter && (
<ExportModal character={activeCharacter} onClose={() => setExportModal(false)} />
)}
</div>
);
}


@@ -0,0 +1,29 @@
import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{js,jsx}'],
extends: [
js.configs.recommended,
reactHooks.configs.flat.recommended,
reactRefresh.configs.vite,
],
languageOptions: {
ecmaVersion: 2020,
globals: globals.browser,
parserOptions: {
ecmaVersion: 'latest',
ecmaFeatures: { jsx: true },
sourceType: 'module',
},
},
rules: {
'no-unused-vars': ['error', { varsIgnorePattern: '^[A-Z_]' }],
},
},
])


@@ -0,0 +1,13 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>homeai-character</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.jsx"></script>
</body>
</html>

homeai-character/package-lock.json (generated, 3339 lines)

File diff suppressed because it is too large.


@@ -0,0 +1,33 @@
{
"name": "homeai-character",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"lint": "eslint .",
"preview": "vite preview"
},
"dependencies": {
"@tailwindcss/vite": "^4.2.1",
"ajv": "^8.18.0",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"tailwindcss": "^4.2.1"
},
"devDependencies": {
"@eslint/js": "^9.39.1",
"@types/react": "^19.2.7",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^5.1.1",
"eslint": "^9.39.1",
"eslint-plugin-react-hooks": "^7.0.1",
"eslint-plugin-react-refresh": "^0.4.24",
"globals": "^16.5.0",
"vite": "^8.0.0-beta.13"
},
"overrides": {
"vite": "^8.0.0-beta.13"
}
}


@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="31.88" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 257"><defs><linearGradient id="IconifyId1813088fe1fbc01fb466" x1="-.828%" x2="57.636%" y1="7.652%" y2="78.411%"><stop offset="0%" stop-color="#41D1FF"></stop><stop offset="100%" stop-color="#BD34FE"></stop></linearGradient><linearGradient id="IconifyId1813088fe1fbc01fb467" x1="43.376%" x2="50.316%" y1="2.242%" y2="89.03%"><stop offset="0%" stop-color="#FFEA83"></stop><stop offset="8.333%" stop-color="#FFDD35"></stop><stop offset="100%" stop-color="#FFA800"></stop></linearGradient></defs><path fill="url(#IconifyId1813088fe1fbc01fb466)" d="M255.153 37.938L134.897 252.976c-2.483 4.44-8.862 4.466-11.382.048L.875 37.958c-2.746-4.814 1.371-10.646 6.827-9.67l120.385 21.517a6.537 6.537 0 0 0 2.322-.004l117.867-21.483c5.438-.991 9.574 4.796 6.877 9.62Z"></path><path fill="url(#IconifyId1813088fe1fbc01fb467)" d="M185.432.063L96.44 17.501a3.268 3.268 0 0 0-2.634 3.014l-5.474 92.456a3.268 3.268 0 0 0 3.997 3.378l24.777-5.718c2.318-.535 4.413 1.507 3.936 3.838l-7.361 36.047c-.495 2.426 1.782 4.5 4.151 3.78l15.304-4.649c2.372-.72 4.652 1.36 4.15 3.788l-11.698 56.621c-.732 3.542 3.979 5.473 5.943 2.437l1.313-2.028l72.516-144.72c1.215-2.423-.88-5.186-3.54-4.672l-25.505 4.922c-2.396.462-4.435-1.77-3.759-4.114l16.646-57.705c.677-2.35-1.37-4.583-3.769-4.113Z"></path></svg>



@@ -0,0 +1,82 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "HomeAI Character Config",
"version": "1",
"type": "object",
"required": ["schema_version", "name", "system_prompt", "tts"],
"properties": {
"schema_version": { "type": "integer", "const": 1 },
"name": { "type": "string" },
"display_name": { "type": "string" },
"description": { "type": "string" },
"system_prompt": { "type": "string" },
"model_overrides": {
"type": "object",
"properties": {
"primary": { "type": "string" },
"fast": { "type": "string" }
}
},
"tts": {
"type": "object",
"required": ["engine"],
"properties": {
"engine": {
"type": "string",
"enum": ["kokoro", "chatterbox", "qwen3", "elevenlabs"]
},
"voice_ref_path": { "type": "string" },
"kokoro_voice": { "type": "string" },
"elevenlabs_voice_id": { "type": "string" },
"elevenlabs_model": { "type": "string", "default": "eleven_monolingual_v1" },
"speed": { "type": "number", "default": 1.0 }
}
},
"live2d_expressions": {
"type": "object",
"description": "Maps semantic state to VTube Studio hotkey ID",
"properties": {
"idle": { "type": "string" },
"listening": { "type": "string" },
"thinking": { "type": "string" },
"speaking": { "type": "string" },
"happy": { "type": "string" },
"sad": { "type": "string" },
"surprised": { "type": "string" },
"error": { "type": "string" }
}
},
"vtube_ws_triggers": {
"type": "object",
"description": "VTube Studio WebSocket actions keyed by event name",
"additionalProperties": {
"type": "object",
"properties": {
"type": { "type": "string", "enum": ["hotkey", "parameter"] },
"id": { "type": "string" },
"value": { "type": "number" }
}
}
},
"custom_rules": {
"type": "array",
"description": "Trigger/response overrides for specific contexts",
"items": {
"type": "object",
"properties": {
"trigger": { "type": "string" },
"response": { "type": "string" },
"condition": { "type": "string" }
}
}
},
"notes": { "type": "string" }
}
}
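
The schema file above is what the Character Manager validates against with ajv (declared in package.json). As a dependency-free illustration of what the top-level `required` list and the `tts.engine` enum reject — the `checkCharacter` helper below is a sketch for this page, not part of the repo — the core checks amount to:

```javascript
// Hand-rolled sketch of the v1 schema's required/enum constraints.
// The real app validates with ajv; this only mirrors the top-level
// "required" fields and the tts.engine enum from the schema above.
const TTS_ENGINES = ["kokoro", "chatterbox", "qwen3", "elevenlabs"];

function checkCharacter(c) {
  const errors = [];
  for (const key of ["schema_version", "name", "system_prompt", "tts"]) {
    if (!(key in c)) errors.push(`missing required field: ${key}`);
  }
  if (c.schema_version !== 1) errors.push("schema_version must be the integer 1");
  if (c.tts && !TTS_ENGINES.includes(c.tts.engine)) {
    errors.push(`tts.engine must be one of: ${TTS_ENGINES.join(", ")}`);
  }
  return errors;
}

// A minimal valid character passes; a malformed one collects errors.
const valid = checkCharacter({
  schema_version: 1,
  name: "aria",
  system_prompt: "You are Aria.",
  tts: { engine: "kokoro", kokoro_voice: "af_heart" },
});
const invalid = checkCharacter({ name: "aria", tts: { engine: "siri" } });
console.log(valid.length, invalid.length); // prints: 0 4
```

With ajv the equivalent is `ajv.compile(schema)` followed by `validate(character)`, which additionally enforces types and nested properties that this sketch ignores.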


@@ -1,55 +0,0 @@
#!/usr/bin/env bash
# homeai-character/setup.sh — P5: Character Manager + persona JSON
#
# Components:
# - character.schema.json — v1 character config schema
# - aria.json — default character config
# - Character Manager UI — Vite/React app for editing (dev server :5173)
#
# No hard runtime dependencies (can be developed standalone).
# Output (aria.json) is consumed by P3, P4, P7.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
source "${REPO_DIR}/scripts/common.sh"
log_section "P5: Character Manager"
detect_platform
# ─── Prerequisite check ────────────────────────────────────────────────────────
log_info "Checking prerequisites..."
if ! command_exists node; then
log_warn "Node.js not found — required for Character Manager UI"
log_warn "Install: https://nodejs.org (v18+ recommended)"
fi
# ─── TODO: Implementation ──────────────────────────────────────────────────────
cat <<'EOF'
┌─────────────────────────────────────────────────────────────────┐
│ P5: homeai-character — NOT YET IMPLEMENTED │
│ │
│ Implementation steps: │
│ 1. Create schema/character.schema.json (v1) │
│ 2. Create characters/aria.json (default persona) │
│ 3. Set up Vite/React project in src/ │
│ 4. Extend character-manager.jsx with full UI │
│ 5. Add schema validation (ajv) │
│ 6. Add expression mapper UI for Live2D │
│ 7. Wire export to ~/.openclaw/characters/ │
│ │
│ Dev server: │
│ cd homeai-character && npm run dev → http://localhost:5173 │
│ │
│ Interface contracts: │
│ Output: ~/.openclaw/characters/<name>.json │
│ Schema: homeai-character/schema/character.schema.json │
└─────────────────────────────────────────────────────────────────┘
EOF
log_info "P5 is not yet implemented. See homeai-character/PLAN.md for details."
exit 0


@@ -0,0 +1,42 @@
#root {
max-width: 1280px;
margin: 0 auto;
padding: 2rem;
text-align: center;
}
.logo {
height: 6em;
padding: 1.5em;
will-change: filter;
transition: filter 300ms;
}
.logo:hover {
filter: drop-shadow(0 0 2em #646cffaa);
}
.logo.react:hover {
filter: drop-shadow(0 0 2em #61dafbaa);
}
@keyframes logo-spin {
from {
transform: rotate(0deg);
}
to {
transform: rotate(360deg);
}
}
@media (prefers-reduced-motion: no-preference) {
a:nth-of-type(2) .logo {
animation: logo-spin infinite 20s linear;
}
}
.card {
padding: 2em;
}
.read-the-docs {
color: #888;
}


@@ -0,0 +1,11 @@
import CharacterManager from './CharacterManager'
function App() {
return (
<div className="min-h-screen bg-gray-50 text-gray-900 py-8">
<CharacterManager />
</div>
)
}
export default App


@@ -0,0 +1,423 @@
import React, { useState } from 'react';
import { validateCharacter } from './SchemaValidator';
const DEFAULT_CHARACTER = {
schema_version: 1,
name: "aria",
display_name: "Aria",
description: "Default HomeAI assistant persona",
system_prompt: "You are Aria, a warm, curious, and helpful AI assistant living in the home. You speak naturally and conversationally — never robotic. You are knowledgeable but never condescending. You remember the people you live with and build on those memories over time. Keep responses concise when controlling smart home devices; be more expressive in casual conversation. Never break character.",
model_overrides: {
primary: "llama3.3:70b",
fast: "qwen2.5:7b"
},
tts: {
engine: "kokoro",
kokoro_voice: "af_heart",
speed: 1.0
},
live2d_expressions: {
idle: "expr_idle",
listening: "expr_listening",
thinking: "expr_thinking",
speaking: "expr_speaking",
happy: "expr_happy",
sad: "expr_sad",
surprised: "expr_surprised",
error: "expr_error"
},
vtube_ws_triggers: {
thinking: { type: "hotkey", id: "expr_thinking" },
speaking: { type: "hotkey", id: "expr_speaking" },
idle: { type: "hotkey", id: "expr_idle" }
},
custom_rules: [
{ trigger: "good morning", response: "Good morning! How did you sleep?", condition: "time_of_day == morning" }
],
notes: ""
};
export default function CharacterManager() {
const [character, setCharacter] = useState(DEFAULT_CHARACTER);
const [error, setError] = useState(null);
// ElevenLabs state
const [elevenLabsApiKey, setElevenLabsApiKey] = useState(localStorage.getItem('elevenlabs_api_key') || '');
const [elevenLabsVoices, setElevenLabsVoices] = useState([]);
const [elevenLabsModels, setElevenLabsModels] = useState([]);
const [isLoadingElevenLabs, setIsLoadingElevenLabs] = useState(false);
const fetchElevenLabsData = async (key) => {
if (!key) return;
setIsLoadingElevenLabs(true);
try {
const headers = { 'xi-api-key': key };
const [voicesRes, modelsRes] = await Promise.all([
fetch('https://api.elevenlabs.io/v1/voices', { headers }),
fetch('https://api.elevenlabs.io/v1/models', { headers })
]);
if (!voicesRes.ok || !modelsRes.ok) {
throw new Error('Failed to fetch from ElevenLabs API (check API key)');
}
const voicesData = await voicesRes.json();
const modelsData = await modelsRes.json();
setElevenLabsVoices(voicesData.voices || []);
setElevenLabsModels((modelsData || []).filter(m => m.can_do_text_to_speech)); // guard the response, not the filter result (filter always returns an array)
localStorage.setItem('elevenlabs_api_key', key);
} catch (err) {
setError(err.message);
} finally {
setIsLoadingElevenLabs(false);
}
};
// Automatically fetch if key exists on load
React.useEffect(() => {
if (elevenLabsApiKey && character.tts.engine === 'elevenlabs') {
fetchElevenLabsData(elevenLabsApiKey);
}
}, [character.tts.engine]); // eslint-disable-line react-hooks/exhaustive-deps -- refetch only when the engine changes, not on every API-key keystroke
const handleExport = () => {
try {
validateCharacter(character);
setError(null);
const dataStr = "data:text/json;charset=utf-8," + encodeURIComponent(JSON.stringify(character, null, 2));
const downloadAnchorNode = document.createElement('a');
downloadAnchorNode.setAttribute("href", dataStr);
downloadAnchorNode.setAttribute("download", `${character.name || 'character'}.json`);
document.body.appendChild(downloadAnchorNode);
downloadAnchorNode.click();
downloadAnchorNode.remove();
} catch (err) {
setError(err.message);
}
};
const handleImport = (e) => {
const file = e.target.files[0];
if (!file) return;
const reader = new FileReader();
reader.onload = (e) => {
try {
const importedChar = JSON.parse(e.target.result);
validateCharacter(importedChar);
setCharacter(importedChar);
setError(null);
} catch (err) {
setError(`Import failed: ${err.message}`);
}
};
reader.readAsText(file);
};
const handleChange = (field, value) => {
setCharacter(prev => ({ ...prev, [field]: value }));
};
const handleNestedChange = (parent, field, value) => {
setCharacter(prev => ({
...prev,
[parent]: { ...prev[parent], [field]: value }
}));
};
const handleRuleChange = (index, field, value) => {
setCharacter(prev => {
const newRules = [...(prev.custom_rules || [])];
newRules[index] = { ...newRules[index], [field]: value };
return { ...prev, custom_rules: newRules };
});
};
const addRule = () => {
setCharacter(prev => ({
...prev,
custom_rules: [...(prev.custom_rules || []), { trigger: "", response: "", condition: "" }]
}));
};
const removeRule = (index) => {
setCharacter(prev => {
const newRules = [...(prev.custom_rules || [])];
newRules.splice(index, 1);
return { ...prev, custom_rules: newRules };
});
};
const previewTTS = async () => {
const text = `Hi, I am ${character.display_name}. This is a preview of my voice.`;
if (character.tts.engine === 'kokoro') {
try {
const response = await fetch('http://localhost:8081/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text, voice: character.tts.kokoro_voice })
});
if (!response.ok) throw new Error('Failed to fetch Kokoro audio');
const blob = await response.blob();
const url = URL.createObjectURL(blob);
const audio = new Audio(url);
audio.onended = () => URL.revokeObjectURL(url); // free the blob URL after playback
audio.play();
} catch (err) {
setError(`Kokoro preview failed: ${err.message}. Falling back to browser TTS.`);
runBrowserTTS(text);
}
} else {
runBrowserTTS(text);
}
};
const runBrowserTTS = (text) => {
const utterance = new SpeechSynthesisUtterance(text);
utterance.rate = character.tts.speed;
const voices = window.speechSynthesis.getVoices();
const preferredVoice = voices.find(v => v.lang.startsWith('en') && v.name.includes('Female')) || voices.find(v => v.lang.startsWith('en'));
if (preferredVoice) utterance.voice = preferredVoice;
window.speechSynthesis.cancel();
window.speechSynthesis.speak(utterance);
};
return (
<div className="max-w-4xl mx-auto p-6 space-y-6">
<div className="flex justify-between items-center">
<h1 className="text-3xl font-bold">HomeAI Character Manager</h1>
<div className="space-x-4">
<label className="cursor-pointer bg-blue-500 text-white px-4 py-2 rounded shadow hover:bg-blue-600">
Import JSON
<input type="file" accept=".json" className="hidden" onChange={handleImport} />
</label>
<button onClick={handleExport} className="bg-green-500 text-white px-4 py-2 rounded shadow hover:bg-green-600">
Export JSON
</button>
</div>
</div>
{error && <div className="bg-red-100 border border-red-400 text-red-700 px-4 py-3 rounded relative">{error}</div>}
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
<div className="space-y-4 bg-white p-4 shadow rounded">
<h2 className="text-xl font-semibold">Basic Info</h2>
<div>
<label className="block text-sm font-medium mb-1">Name (ID)</label>
<input type="text" className="w-full border p-2 rounded" value={character.name} onChange={(e) => handleChange('name', e.target.value)} />
</div>
<div>
<label className="block text-sm font-medium mb-1">Display Name</label>
<input type="text" className="w-full border p-2 rounded" value={character.display_name} onChange={(e) => handleChange('display_name', e.target.value)} />
</div>
<div>
<label className="block text-sm font-medium mb-1">Description</label>
<input type="text" className="w-full border p-2 rounded" value={character.description} onChange={(e) => handleChange('description', e.target.value)} />
</div>
</div>
<div className="space-y-4 bg-white p-4 shadow rounded">
<h2 className="text-xl font-semibold">TTS Configuration</h2>
<div>
<label className="block text-sm font-medium mb-1">Engine</label>
<select className="w-full border p-2 rounded" value={character.tts.engine} onChange={(e) => handleNestedChange('tts', 'engine', e.target.value)}>
<option value="kokoro">Kokoro</option>
<option value="chatterbox">Chatterbox</option>
<option value="qwen3">Qwen3</option>
<option value="elevenlabs">ElevenLabs</option>
</select>
</div>
{character.tts.engine === 'elevenlabs' && (
<div className="space-y-4 border p-4 rounded bg-gray-50">
<div>
<label className="block text-xs font-medium mb-1 text-gray-500">ElevenLabs API Key (Local Use Only)</label>
<div className="flex space-x-2">
<input type="password" placeholder="sk_..." className="w-full border p-2 rounded text-sm" value={elevenLabsApiKey} onChange={(e) => setElevenLabsApiKey(e.target.value)} />
<button onClick={() => fetchElevenLabsData(elevenLabsApiKey)} disabled={isLoadingElevenLabs} className="bg-blue-500 text-white px-3 py-1 rounded text-sm whitespace-nowrap hover:bg-blue-600 disabled:opacity-50">
{isLoadingElevenLabs ? 'Loading...' : 'Fetch'}
</button>
</div>
</div>
<div className="space-y-2">
<div>
<label className="block text-sm font-medium mb-1">Voice ID</label>
{elevenLabsVoices.length > 0 ? (
<select className="w-full border p-2 rounded" value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => handleNestedChange('tts', 'elevenlabs_voice_id', e.target.value)}>
<option value="">-- Select Voice --</option>
{elevenLabsVoices.map(v => (
<option key={v.voice_id} value={v.voice_id}>{v.name} ({v.category})</option>
))}
</select>
) : (
<input type="text" className="w-full border p-2 rounded" value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => handleNestedChange('tts', 'elevenlabs_voice_id', e.target.value)} placeholder="e.g. 21m00Tcm4TlvDq8ikWAM" />
)}
</div>
<div>
<label className="block text-sm font-medium mb-1">Model</label>
{elevenLabsModels.length > 0 ? (
<select className="w-full border p-2 rounded" value={character.tts.elevenlabs_model || 'eleven_monolingual_v1'} onChange={(e) => handleNestedChange('tts', 'elevenlabs_model', e.target.value)}>
<option value="">-- Select Model --</option>
{elevenLabsModels.map(m => (
<option key={m.model_id} value={m.model_id}>{m.name} ({m.model_id})</option>
))}
</select>
) : (
<input type="text" className="w-full border p-2 rounded" value={character.tts.elevenlabs_model || 'eleven_monolingual_v1'} onChange={(e) => handleNestedChange('tts', 'elevenlabs_model', e.target.value)} placeholder="e.g. eleven_monolingual_v1" />
)}
</div>
</div>
</div>
)}
{character.tts.engine === 'kokoro' && (
<div>
<label className="block text-sm font-medium mb-1">Kokoro Voice</label>
<select className="w-full border p-2 rounded" value={character.tts.kokoro_voice || 'af_heart'} onChange={(e) => handleNestedChange('tts', 'kokoro_voice', e.target.value)}>
<option value="af_heart">af_heart (American Female)</option>
<option value="af_alloy">af_alloy (American Female)</option>
<option value="af_aoede">af_aoede (American Female)</option>
<option value="af_bella">af_bella (American Female)</option>
<option value="af_jessica">af_jessica (American Female)</option>
<option value="af_kore">af_kore (American Female)</option>
<option value="af_nicole">af_nicole (American Female)</option>
<option value="af_nova">af_nova (American Female)</option>
<option value="af_river">af_river (American Female)</option>
<option value="af_sarah">af_sarah (American Female)</option>
<option value="af_sky">af_sky (American Female)</option>
<option value="am_adam">am_adam (American Male)</option>
<option value="am_echo">am_echo (American Male)</option>
<option value="am_eric">am_eric (American Male)</option>
<option value="am_fenrir">am_fenrir (American Male)</option>
<option value="am_liam">am_liam (American Male)</option>
<option value="am_michael">am_michael (American Male)</option>
<option value="am_onyx">am_onyx (American Male)</option>
<option value="am_puck">am_puck (American Male)</option>
<option value="am_santa">am_santa (American Male)</option>
<option value="bf_alice">bf_alice (British Female)</option>
<option value="bf_emma">bf_emma (British Female)</option>
<option value="bf_isabella">bf_isabella (British Female)</option>
<option value="bf_lily">bf_lily (British Female)</option>
<option value="bm_daniel">bm_daniel (British Male)</option>
<option value="bm_fable">bm_fable (British Male)</option>
<option value="bm_george">bm_george (British Male)</option>
<option value="bm_lewis">bm_lewis (British Male)</option>
</select>
</div>
)}
{character.tts.engine === 'chatterbox' && (
<div>
<label className="block text-sm font-medium mb-1">Voice Reference Path</label>
<input type="text" className="w-full border p-2 rounded" value={character.tts.voice_ref_path || ''} onChange={(e) => handleNestedChange('tts', 'voice_ref_path', e.target.value)} />
</div>
)}
<div>
<label className="block text-sm font-medium mb-1">Speed: {character.tts.speed}</label>
<input type="range" min="0.5" max="2.0" step="0.1" className="w-full" value={character.tts.speed} onChange={(e) => handleNestedChange('tts', 'speed', parseFloat(e.target.value))} />
</div>
<button
onClick={previewTTS}
className="w-full bg-indigo-500 text-white px-4 py-2 rounded shadow hover:bg-indigo-600 transition"
>
🔊 Preview Voice Speed
</button>
<p className="text-xs text-gray-500 mt-1">Note: Kokoro previews are fetched from the local bridge. Others use browser TTS for speed testing.</p>
</div>
</div>
<div className="bg-white p-4 shadow rounded space-y-2">
<div className="flex justify-between items-center">
<h2 className="text-xl font-semibold">System Prompt</h2>
<span className="text-xs text-gray-500">{character.system_prompt.length} chars</span>
</div>
<textarea
className="w-full border p-2 rounded h-32"
value={character.system_prompt}
onChange={(e) => handleChange('system_prompt', e.target.value)}
/>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
<div className="bg-white p-4 shadow rounded space-y-4">
<h2 className="text-xl font-semibold">Live2D Expressions</h2>
{Object.entries(character.live2d_expressions).map(([key, val]) => (
<div key={key} className="flex justify-between items-center">
<label className="text-sm font-medium w-1/3 capitalize">{key}</label>
<input type="text" className="border p-1 rounded w-2/3" value={val} onChange={(e) => handleNestedChange('live2d_expressions', key, e.target.value)} />
</div>
))}
</div>
<div className="bg-white p-4 shadow rounded space-y-4">
<h2 className="text-xl font-semibold">Model Overrides</h2>
<div>
<label className="block text-sm font-medium mb-1">Primary Model</label>
<select className="w-full border p-2 rounded" value={character.model_overrides?.primary || 'llama3.3:70b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
<option value="llama3.3:70b">llama3.3:70b</option>
<option value="qwen2.5:7b">qwen2.5:7b</option>
<option value="qwen3:32b">qwen3:32b</option>
<option value="codestral:22b">codestral:22b</option>
<option value="gemma-3-27b">gemma-3-27b</option>
<option value="DeepSeek-R1-8B">DeepSeek-R1-8B</option>
</select>
</div>
<div>
<label className="block text-sm font-medium mb-1">Fast Model</label>
<select className="w-full border p-2 rounded" value={character.model_overrides?.fast || 'qwen2.5:7b'} onChange={(e) => handleNestedChange('model_overrides', 'fast', e.target.value)}>
<option value="qwen2.5:7b">qwen2.5:7b</option>
<option value="llama3.3:70b">llama3.3:70b</option>
<option value="qwen3:32b">qwen3:32b</option>
<option value="codestral:22b">codestral:22b</option>
<option value="gemma-3-27b">gemma-3-27b</option>
<option value="DeepSeek-R1-8B">DeepSeek-R1-8B</option>
</select>
</div>
</div>
</div>
<div className="bg-white p-4 shadow rounded space-y-4">
<div className="flex justify-between items-center">
<h2 className="text-xl font-semibold">Custom Rules</h2>
<button onClick={addRule} className="bg-blue-500 text-white px-3 py-1 rounded text-sm hover:bg-blue-600">
+ Add Rule
</button>
</div>
{(!character.custom_rules || character.custom_rules.length === 0) ? (
<p className="text-sm text-gray-500 italic">No custom rules defined.</p>
) : (
<div className="space-y-4">
{character.custom_rules.map((rule, idx) => (
<div key={idx} className="border p-4 rounded relative bg-gray-50">
<button
onClick={() => removeRule(idx)}
className="absolute top-2 right-2 text-red-500 hover:text-red-700"
title="Remove Rule"
>
✕
</button>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4 mt-2">
<div>
<label className="block text-xs font-medium mb-1">Trigger</label>
<input type="text" className="w-full border p-1 rounded text-sm" value={rule.trigger || ''} onChange={(e) => handleRuleChange(idx, 'trigger', e.target.value)} />
</div>
<div>
<label className="block text-xs font-medium mb-1">Condition (Optional)</label>
<input type="text" className="w-full border p-1 rounded text-sm" value={rule.condition || ''} onChange={(e) => handleRuleChange(idx, 'condition', e.target.value)} placeholder="e.g. time_of_day == morning" />
</div>
<div className="md:col-span-2">
<label className="block text-xs font-medium mb-1">Response</label>
<textarea className="w-full border p-1 rounded text-sm h-16" value={rule.response || ''} onChange={(e) => handleRuleChange(idx, 'response', e.target.value)} />
</div>
</div>
</div>
))}
</div>
)}
</div>
</div>
);
}

View File

@@ -0,0 +1,13 @@
import Ajv from 'ajv'
import schema from '../schema/character.schema.json'
const ajv = new Ajv({ allErrors: true, strict: false })
const validate = ajv.compile(schema)
export function validateCharacter(config) {
const valid = validate(config)
if (!valid) {
throw new Error(ajv.errorsText(validate.errors))
}
return true
}

View File

@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="35.93" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 228"><path fill="#00D8FF" d="M210.483 73.824a171.49 171.49 0 0 0-8.24-2.597c.465-1.9.893-3.777 1.273-5.621c6.238-30.281 2.16-54.676-11.769-62.708c-13.355-7.7-35.196.329-57.254 19.526a171.23 171.23 0 0 0-6.375 5.848a155.866 155.866 0 0 0-4.241-3.917C100.759 3.829 77.587-4.822 63.673 3.233C50.33 10.957 46.379 33.89 51.995 62.588a170.974 170.974 0 0 0 1.892 8.48c-3.28.932-6.445 1.924-9.474 2.98C17.309 83.498 0 98.307 0 113.668c0 15.865 18.582 31.778 46.812 41.427a145.52 145.52 0 0 0 6.921 2.165a167.467 167.467 0 0 0-2.01 9.138c-5.354 28.2-1.173 50.591 12.134 58.266c13.744 7.926 36.812-.22 59.273-19.855a145.567 145.567 0 0 0 5.342-4.923a168.064 168.064 0 0 0 6.92 6.314c21.758 18.722 43.246 26.282 56.54 18.586c13.731-7.949 18.194-32.003 12.4-61.268a145.016 145.016 0 0 0-1.535-6.842c1.62-.48 3.21-.974 4.76-1.488c29.348-9.723 48.443-25.443 48.443-41.52c0-15.417-17.868-30.326-45.517-39.844Zm-6.365 70.984c-1.4.463-2.836.91-4.3 1.345c-3.24-10.257-7.612-21.163-12.963-32.432c5.106-11 9.31-21.767 12.459-31.957c2.619.758 5.16 1.557 7.61 2.4c23.69 8.156 38.14 20.213 38.14 29.504c0 9.896-15.606 22.743-40.946 31.14Zm-10.514 20.834c2.562 12.94 2.927 24.64 1.23 33.787c-1.524 8.219-4.59 13.698-8.382 15.893c-8.067 4.67-25.32-1.4-43.927-17.412a156.726 156.726 0 0 1-6.437-5.87c7.214-7.889 14.423-17.06 21.459-27.246c12.376-1.098 24.068-2.894 34.671-5.345a134.17 134.17 0 0 1 1.386 6.193ZM87.276 214.515c-7.882 2.783-14.16 2.863-17.955.675c-8.075-4.657-11.432-22.636-6.853-46.752a156.923 156.923 0 0 1 1.869-8.499c10.486 2.32 22.093 3.988 34.498 4.994c7.084 9.967 14.501 19.128 21.976 27.15a134.668 134.668 0 0 1-4.877 4.492c-9.933 8.682-19.886 14.842-28.658 17.94ZM50.35 144.747c-12.483-4.267-22.792-9.812-29.858-15.863c-6.35-5.437-9.555-10.836-9.555-15.216c0-9.322 
13.897-21.212 37.076-29.293c2.813-.98 5.757-1.905 8.812-2.773c3.204 10.42 7.406 21.315 12.477 32.332c-5.137 11.18-9.399 22.249-12.634 32.792a134.718 134.718 0 0 1-6.318-1.979Zm12.378-84.26c-4.811-24.587-1.616-43.134 6.425-47.789c8.564-4.958 27.502 2.111 47.463 19.835a144.318 144.318 0 0 1 3.841 3.545c-7.438 7.987-14.787 17.08-21.808 26.988c-12.04 1.116-23.565 2.908-34.161 5.309a160.342 160.342 0 0 1-1.76-7.887Zm110.427 27.268a347.8 347.8 0 0 0-7.785-12.803c8.168 1.033 15.994 2.404 23.343 4.08c-2.206 7.072-4.956 14.465-8.193 22.045a381.151 381.151 0 0 0-7.365-13.322Zm-45.032-43.861c5.044 5.465 10.096 11.566 15.065 18.186a322.04 322.04 0 0 0-30.257-.006c4.974-6.559 10.069-12.652 15.192-18.18ZM82.802 87.83a323.167 323.167 0 0 0-7.227 13.238c-3.184-7.553-5.909-14.98-8.134-22.152c7.304-1.634 15.093-2.97 23.209-3.984a321.524 321.524 0 0 0-7.848 12.897Zm8.081 65.352c-8.385-.936-16.291-2.203-23.593-3.793c2.26-7.3 5.045-14.885 8.298-22.6a321.187 321.187 0 0 0 7.257 13.246c2.594 4.48 5.28 8.868 8.038 13.147Zm37.542 31.03c-5.184-5.592-10.354-11.779-15.403-18.433c4.902.192 9.899.29 14.978.29c5.218 0 10.376-.117 15.453-.343c-4.985 6.774-10.018 12.97-15.028 18.486Zm52.198-57.817c3.422 7.8 6.306 15.345 8.596 22.52c-7.422 1.694-15.436 3.058-23.88 4.071a382.417 382.417 0 0 0 7.859-13.026a347.403 347.403 0 0 0 7.425-13.565Zm-16.898 8.101a358.557 358.557 0 0 1-12.281 19.815a329.4 329.4 0 0 1-23.444.823c-7.967 0-15.716-.248-23.178-.732a310.202 310.202 0 0 1-12.513-19.846h.001a307.41 307.41 0 0 1-10.923-20.627a310.278 310.278 0 0 1 10.89-20.637l-.001.001a307.318 307.318 0 0 1 12.413-19.761c7.613-.576 15.42-.876 23.31-.876H128c7.926 0 15.743.303 23.354.883a329.357 329.357 0 0 1 12.335 19.695a358.489 358.489 0 0 1 11.036 20.54a329.472 329.472 0 0 1-11 20.722Zm22.56-122.124c8.572 4.944 11.906 24.881 6.52 51.026c-.344 1.668-.73 3.367-1.15 5.09c-10.622-2.452-22.155-4.275-34.23-5.408c-7.034-10.017-14.323-19.124-21.64-27.008a160.789 160.789 0 0 1 5.888-5.4c18.9-16.447 36.564-22.941 
44.612-18.3ZM128 90.808c12.625 0 22.86 10.235 22.86 22.86s-10.235 22.86-22.86 22.86s-22.86-10.235-22.86-22.86s10.235-22.86 22.86-22.86Z"></path></svg>


View File

@@ -0,0 +1 @@
@import "tailwindcss";

View File

@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.jsx'
createRoot(document.getElementById('root')).render(
<StrictMode>
<App />
</StrictMode>,
)

View File

@@ -0,0 +1,11 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
// https://vite.dev/config/
export default defineConfig({
plugins: [
tailwindcss(),
react(),
],
})

View File

@@ -13,7 +13,7 @@
<string>--wake-word</string>
<string>hey_jarvis</string>
<string>--notify-url</string>
<string>http://localhost:8080/wake</string>
<string>http://localhost:8081/wake</string>
</array>
<key>RunAtLoad</key>

View File

@@ -0,0 +1,28 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.wyoming-elevenlabs</string>
<key>ProgramArguments</key>
<array>
<string>/Users/aodhan/homeai-voice-env/bin/python3</string>
<string>/Users/aodhan/gitea/homeai/homeai-voice/tts/wyoming_elevenlabs_server.py</string>
<string>--uri</string>
<string>tcp://0.0.0.0:10302</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-wyoming-elevenlabs.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-wyoming-elevenlabs.log</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>
</dict>
</plist>

View File

@@ -18,9 +18,9 @@
<string>--area</string>
<string>Living Room</string>
<string>--mic-command</string>
<string>rec -q -r 16000 -c 1 -b 16 -t raw -</string>
<string>/opt/homebrew/bin/rec -q -r 16000 -c 1 -b 16 -t raw -</string>
<string>--snd-command</string>
<string>play -q -r 24000 -c 1 -b 16 -t raw -</string>
<string>/opt/homebrew/bin/play -q -t raw -r 24000 -c 1 -b 16 -e signed-integer -</string>
<string>--mic-command-rate</string>
<string>16000</string>
<string>--mic-command-width</string>
@@ -33,10 +33,18 @@
<string>2</string>
<string>--snd-command-channels</string>
<string>1</string>
<string>--wake-command</string>
<string>/Users/aodhan/homeai-voice-env/bin/python3 /Users/aodhan/gitea/homeai/homeai-voice/wyoming/wakeword_command.py --wake-word hey_jarvis --threshold 0.5</string>
<string>--wake-command-rate</string>
<string>16000</string>
<string>--wake-command-width</string>
<string>2</string>
<string>--wake-command-channels</string>
<string>1</string>
<string>--awake-wav</string>
<string>/System/Library/Sounds/Glass.aiff</string>
<string>/Users/aodhan/homeai-data/sounds/awake.wav</string>
<string>--done-wav</string>
<string>/System/Library/Sounds/Blow.aiff</string>
<string>/Users/aodhan/homeai-data/sounds/done.wav</string>
<string>--no-zeroconf</string>
</array>

View File

@@ -0,0 +1,10 @@
#!/bin/bash
# Monitor wake word detection in real-time
echo "Monitoring wake word detection..."
echo "Say 'Hey Jarvis' to test"
echo "Press Ctrl+C to stop"
echo ""
# Watch both the wake word log and bridge log
tail -f /tmp/homeai-wakeword-error.log /tmp/homeai-openclaw-bridge.log 2>/dev/null | grep -E "(Wake word detected|Listening|Failed to notify)"

View File

@@ -0,0 +1,186 @@
#!/usr/bin/env python3
"""Wyoming TTS server backed by ElevenLabs.
Usage:
python wyoming_elevenlabs_server.py --uri tcp://0.0.0.0:10302 --voice-id 21m00Tcm4TlvDq8ikWAM
"""
import argparse
import asyncio
import logging
import os
import wave
import io
from urllib import request, error
from wyoming.audio import AudioChunk, AudioStart, AudioStop
from wyoming.event import Event
from wyoming.info import Attribution, Info, TtsProgram, TtsVoice, TtsVoiceSpeaker
from wyoming.server import AsyncEventHandler, AsyncServer
from wyoming.tts import Synthesize
_LOGGER = logging.getLogger(__name__)
SAMPLE_RATE = 24000
SAMPLE_WIDTH = 2 # int16
CHANNELS = 1
CHUNK_SECONDS = 1 # stream in 1-second chunks
class ElevenLabsEventHandler(AsyncEventHandler):
def __init__(self, default_voice_id: str, default_model: str, api_key: str, speed: float, *args, **kwargs):
super().__init__(*args, **kwargs)
self._default_voice_id = default_voice_id
self._default_model = default_model
self._api_key = api_key
self._speed = speed
# Send info immediately on connect
asyncio.ensure_future(self._send_info())
async def _send_info(self):
info = Info(
tts=[
TtsProgram(
name="elevenlabs",
description="ElevenLabs API TTS",
attribution=Attribution(
name="ElevenLabs",
url="https://elevenlabs.io/",
),
installed=True,
version="1.0.0",
voices=[
TtsVoice(
name=self._default_voice_id,
description="ElevenLabs Voice",
attribution=Attribution(name="elevenlabs", url=""),
installed=True,
languages=["en-us"],
version="1.0",
speakers=[TtsVoiceSpeaker(name=self._default_voice_id)],
)
],
)
]
)
await self.write_event(info.event())
async def handle_event(self, event: Event) -> bool:
if Synthesize.is_type(event.type):
synthesize = Synthesize.from_event(event)
text = synthesize.text
voice_id = self._default_voice_id
if synthesize.voice and synthesize.voice.name:
voice_id = synthesize.voice.name
_LOGGER.debug("Synthesizing %r with voice_id=%s model=%s", text, voice_id, self._default_model)
try:
loop = asyncio.get_running_loop()
audio_bytes = await loop.run_in_executor(
None, lambda: self._call_elevenlabs_api(text, voice_id)
)
if audio_bytes is None:
raise Exception("Failed to generate audio from ElevenLabs")
await self.write_event(
AudioStart(rate=SAMPLE_RATE, width=SAMPLE_WIDTH, channels=CHANNELS).event()
)
chunk_size = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_SECONDS
for i in range(0, len(audio_bytes), chunk_size):
await self.write_event(
AudioChunk(
rate=SAMPLE_RATE,
width=SAMPLE_WIDTH,
channels=CHANNELS,
audio=audio_bytes[i : i + chunk_size],
).event()
)
await self.write_event(AudioStop().event())
_LOGGER.info("Synthesized audio completed")
except Exception:
_LOGGER.exception("Synthesis error")
await self.write_event(AudioStop().event())
return True # keep connection open
def _call_elevenlabs_api(self, text: str, voice_id: str) -> "bytes | None":
import json
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}?output_format=pcm_24000"
headers = {
"Accept": "audio/pcm",
"Content-Type": "application/json",
"xi-api-key": self._api_key
}
data = {
"text": text,
"model_id": self._default_model,
}
req = request.Request(url, data=json.dumps(data).encode('utf-8'), headers=headers, method='POST')
try:
with request.urlopen(req) as response:
if response.status == 200:
return response.read()
else:
_LOGGER.error(f"ElevenLabs API Error: {response.status}")
return None
except error.HTTPError as e:
_LOGGER.error(f"ElevenLabs HTTP Error: {e.code} - {e.read().decode('utf-8')}")
return None
except Exception as e:
_LOGGER.error(f"ElevenLabs Request Error: {str(e)}")
return None
async def main():
parser = argparse.ArgumentParser()
parser.add_argument("--uri", default="tcp://0.0.0.0:10302")
parser.add_argument("--voice-id", default="21m00Tcm4TlvDq8ikWAM", help="Default ElevenLabs Voice ID")
parser.add_argument("--model", default="eleven_monolingual_v1", help="ElevenLabs Model ID")
parser.add_argument("--speed", type=float, default=1.0)
parser.add_argument("--debug", action="store_true")
args = parser.parse_args()
logging.basicConfig(
level=logging.DEBUG if args.debug else logging.INFO,
format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
api_key = os.environ.get("ELEVENLABS_API_KEY")
if not api_key:
# Try to read from .env file directly if not exported in shell
try:
env_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), '.env')
if os.path.exists(env_path):
with open(env_path, 'r') as f:
for line in f:
if line.startswith('ELEVENLABS_API_KEY='):
api_key = line.split('=', 1)[1].strip()
break
except Exception:
pass
if not api_key:
_LOGGER.warning("ELEVENLABS_API_KEY environment variable not set. API calls will fail.")
_LOGGER.info("Starting ElevenLabs Wyoming TTS on %s (voice-id=%s, model=%s)", args.uri, args.voice_id, args.model)
server = AsyncServer.from_uri(args.uri)
def handler_factory(reader, writer):
return ElevenLabsEventHandler(args.voice_id, args.model, api_key, args.speed, reader, writer)
await server.run(handler_factory)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,77 @@
#!/usr/bin/env python3
"""Wake word detection command for Wyoming Satellite.
The satellite feeds raw 16kHz 16-bit mono audio via stdin.
This script reads that audio, runs openWakeWord, and prints
the wake word name to stdout when detected.
Usage (called by wyoming-satellite --wake-command):
python wakeword_command.py [--wake-word hey_jarvis] [--threshold 0.5]
"""
import argparse
import sys
import numpy as np
import logging
_LOGGER = logging.getLogger(__name__)
SAMPLE_RATE = 16000
CHUNK_SIZE = 1280 # ~80ms at 16kHz — recommended by openWakeWord
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--wake-word", default="hey_jarvis")
parser.add_argument("--threshold", type=float, default=0.5)
parser.add_argument("--cooldown", type=float, default=3.0)
parser.add_argument("--debug", action="store_true")
args = parser.parse_args()
logging.basicConfig(
level=logging.DEBUG if args.debug else logging.WARNING,
format="%(asctime)s %(levelname)s %(message)s",
stream=sys.stderr,
)
from openwakeword.model import Model
oww = Model(
wakeword_models=[args.wake_word],
inference_framework="onnx",
)
import time
last_trigger = 0.0
bytes_per_chunk = CHUNK_SIZE * 2 # 16-bit = 2 bytes per sample
_LOGGER.debug("Wake word command ready, reading audio from stdin")
try:
while True:
raw = sys.stdin.buffer.read(bytes_per_chunk)
if not raw:
break
if len(raw) < bytes_per_chunk:
# Pad with zeros if short read
raw = raw + b'\x00' * (bytes_per_chunk - len(raw))
chunk = np.frombuffer(raw, dtype=np.int16)
oww.predict(chunk)
for ww, scores in oww.prediction_buffer.items():
score = scores[-1] if scores else 0.0
if score >= args.threshold:
now = time.time()
if now - last_trigger >= args.cooldown:
last_trigger = now
# Print wake word name to stdout — satellite reads this
print(ww, flush=True)
_LOGGER.debug("Wake word detected: %s (score=%.3f)", ww, score)
except (KeyboardInterrupt, BrokenPipeError):
pass
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,92 @@
# P5: HomeAI Character System Development Plan
> Created: 2026-03-07 | Phase: 3 - Agent & Character
## Overview
Phase 5 (P5) focuses on creating a unified, JSON-based character configuration system that serves as the single source of truth for the AI assistant's personality, voice, visual expressions, and behavioral rules. This configuration will be consumed by OpenClaw (P4), the Voice Pipeline (P3), and the Visual Layer (P7).
A key component of this phase is building the **Character Manager UI**—a local React application that provides a user-friendly interface for editing character definitions, validating them against a strict JSON schema, and exporting them for use by the agent.
---
## 1. Schema & Foundation
The first step is establishing the strict data contract that all other services will rely on.
### 1.1 Define Character Schema
- Create `homeai-character/schema/character.schema.json` (v1).
- Define required fields: `schema_version`, `name`, `system_prompt`, `tts`.
- Define optional/advanced fields: `model_overrides`, `live2d_expressions`, `vtube_ws_triggers`, `custom_rules`, `notes`.
- Document the schema in `homeai-character/schema/README.md`.
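
A rough sketch of what the v1 schema could look like — the draft version, patterns, and bounds below are assumptions for illustration; the authoritative definition is `character.schema.json`:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "HomeAI Character",
  "type": "object",
  "required": ["schema_version", "name", "system_prompt", "tts"],
  "properties": {
    "schema_version": { "type": "integer", "const": 1 },
    "name": { "type": "string", "pattern": "^[a-z0-9_-]+$" },
    "display_name": { "type": "string" },
    "description": { "type": "string" },
    "system_prompt": { "type": "string", "minLength": 1 },
    "tts": {
      "type": "object",
      "required": ["engine"],
      "properties": {
        "engine": { "enum": ["kokoro", "chatterbox", "qwen3", "elevenlabs"] },
        "kokoro_voice": { "type": "string" },
        "voice_ref_path": { "type": "string" },
        "elevenlabs_voice_id": { "type": "string" },
        "elevenlabs_model": { "type": "string" },
        "speed": { "type": "number", "minimum": 0.5, "maximum": 2.0 }
      }
    },
    "model_overrides": { "type": "object" },
    "live2d_expressions": { "type": "object" },
    "vtube_ws_triggers": { "type": "object" },
    "custom_rules": { "type": "array" },
    "notes": { "type": "string" }
  }
}
```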
### 1.2 Create Default Character Profile
- Create `homeai-character/characters/aria.json` conforming to the schema.
- Define the default system prompt for "Aria" (warm, helpful, concise for smart home tasks).
- Configure default TTS settings (`engine: "kokoro"`, `kokoro_voice: "af_heart"`).
- Add placeholder mappings for `live2d_expressions` and `vtube_ws_triggers`.
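
Assuming the required fields above, a minimal `aria.json` might look like this (values are illustrative; the real profile lives in `homeai-character/characters/aria.json`):

```json
{
  "schema_version": 1,
  "name": "aria",
  "display_name": "Aria",
  "description": "Warm, helpful smart-home assistant",
  "system_prompt": "You are Aria, a warm, concise assistant for smart home tasks...",
  "tts": {
    "engine": "kokoro",
    "kokoro_voice": "af_heart",
    "speed": 1.0
  },
  "live2d_expressions": { "idle": "", "listening": "", "thinking": "", "speaking": "" },
  "vtube_ws_triggers": {}
}
```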
---
## 2. Character Manager UI Development
Transform the existing prototype (`character-manager.jsx`) into a fully functional local web tool.
### 2.1 Project Initialization
- Scaffold a new Vite + React project in `homeai-character/src/`.
- Install necessary dependencies: `react`, `react-dom`, `ajv` (for schema validation), and styling utilities (e.g., Tailwind CSS).
- Migrate the existing `character-manager.jsx` into the new project structure.
### 2.2 Schema Validation Integration
- Implement `SchemaValidator.js` using `ajv` to validate character configurations against `character.schema.json`.
- Enforce validation checks before allowing the user to export or save a character profile.
- Display clear error messages in the UI if validation fails.
### 2.3 UI Feature Implementation
- **Basic Info & Prompt Editor:** Fields for name, description, and a multi-line editor for the system prompt (with character count).
- **TTS Configuration:** Dropdowns for engine selection (Kokoro, Chatterbox, Qwen3) and inputs for voice reference paths/speed.
- **Expression Mapping Table:** UI to map semantic states (idle, listening, thinking, speaking, etc.) to VTube Studio hotkey IDs.
- **Custom Rules Editor:** Interface to add, edit, and delete trigger/response/condition pairs.
- **Import/Export Pipeline:** Functionality to load an existing JSON file, edit it, and download/save the validated output.
---
## 3. Pipeline Integration (Wiring it up)
Ensure that the generated character configurations are actually used by the rest of the HomeAI ecosystem.
### 3.1 OpenClaw Integration (P4 Link)
- Configure OpenClaw to load the active character from `~/.openclaw/characters/aria.json`.
- Modify OpenClaw's initialization to inject the `system_prompt` from the JSON into Ollama requests.
- Implement schema version checking in OpenClaw (fail gracefully if `schema_version` is unsupported).
- Ensure OpenClaw supports hot-reloading if the character JSON is updated.
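
The load path can be sketched in a few lines of Python — function names are illustrative, not OpenClaw's actual API; the default model matches the UI's `llama3.3:70b` default:

```python
import json
from pathlib import Path

# Bump this set as new schema versions are supported.
SUPPORTED_SCHEMA_VERSIONS = {1}

def load_character(path: str) -> dict:
    """Load a character profile, failing fast on unsupported schema versions."""
    config = json.loads(Path(path).read_text())
    version = config.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"Unsupported character schema_version: {version!r}")
    return config

def build_ollama_request(config: dict, user_text: str) -> dict:
    """Inject the character's system prompt into an Ollama chat payload."""
    return {
        "model": config.get("model_overrides", {}).get("primary", "llama3.3:70b"),
        "messages": [
            {"role": "system", "content": config["system_prompt"]},
            {"role": "user", "content": user_text},
        ],
    }
```

Hot-reloading can then be as simple as re-running `load_character` when the file's mtime changes.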
### 3.2 Voice Pipeline Integration (P3 Link)
- Update the TTS dispatch logic to read the `tts` configuration block from the character JSON.
- Dynamically route TTS requests based on the `engine` field (e.g., routing to Kokoro vs. Chatterbox).
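
The dispatch can be a simple table from `tts.engine` to a Wyoming endpoint. The mapping below is an assumption: it reuses port 10301 (`WYOMING_TTS_URL`) for the local engines and 10302 for the ElevenLabs server added in this commit — adjust to match the actual deployment:

```python
# Assumed engine → Wyoming TTS endpoint mapping; adjust per deployment.
ENGINE_ENDPOINTS = {
    "kokoro": "tcp://localhost:10301",
    "chatterbox": "tcp://localhost:10301",
    "qwen3": "tcp://localhost:10301",
    "elevenlabs": "tcp://localhost:10302",
}

def resolve_tts_endpoint(character: dict) -> str:
    """Pick the Wyoming TTS endpoint from the character's tts.engine field."""
    engine = character.get("tts", {}).get("engine", "kokoro")
    try:
        return ENGINE_ENDPOINTS[engine]
    except KeyError:
        raise ValueError(f"Unknown TTS engine: {engine!r}") from None
```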
---
## 4. Custom Voice Cloning (Optional/Advanced)
If moving beyond the default Kokoro voice, set up a custom voice clone.
### 4.1 Audio Processing
- Record 30-60 seconds of clean reference audio for the character (`~/voices/aria-raw.wav`).
- Pre-process the audio using FFmpeg: `ffmpeg -i aria-raw.wav -ar 22050 -ac 1 aria.wav`.
- Move the processed file to the designated directory (`~/voices/aria.wav`).
### 4.2 Configuration & Testing
- Update `aria.json` to use `"engine": "chatterbox"` and set `"voice_ref_path"` to the new audio file.
- Test the voice output. If the quality is insufficient, evaluate Qwen3-TTS as a fallback alternative.
---
## Success Criteria Checklist
- [ ] `character.schema.json` is fully defined and documented.
- [ ] `aria.json` is created and passes strict validation against the schema.
- [ ] Vite-based Character Manager UI runs locally without errors.
- [ ] Character Manager successfully imports, edits, validates, and exports character JSONs.
- [ ] OpenClaw successfully reads `aria.json` and applies the system prompt to LLM generation.
- [ ] TTS engine selection dynamically respects the configuration in the character JSON.
- [ ] (Optional) Custom voice reference audio is processed and tested.