feat: upgrade voice pipeline — MLX Whisper STT (20x faster), Qwen3.5 MoE LLM, fix HA tool calling

- Replace faster-whisper with wyoming-mlx-whisper (whisper-large-v3-turbo, MLX Metal GPU)
  STT latency: 8.4s → 400ms for short voice commands
- Add Qwen3.5-35B-A3B (MoE, 3B active params, Q8_0) to Ollama — 26.7 tok/s vs 5.4 tok/s (70B)
- Add model preload launchd service to pin voice model in VRAM permanently
- Fix HA tool calling: set commands.native=true, symlink ha-ctl to PATH
- Add pipeline benchmark script (STT/LLM/TTS latency profiling)
- Add service restart buttons and STT endpoint to dashboard
- Bind Vite dev server to 0.0.0.0 for LAN access

Total estimated pipeline latency: ~27s → ~4s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Aodhan Collins
2026-03-13 18:03:12 +00:00
parent 1bfd7fbd08
commit af6b7bd945
10 changed files with 721 additions and 27 deletions

View File

@@ -2,6 +2,14 @@
# Copy to .env and fill in your values.
# .env is gitignored — never commit it.
# ─── API Keys ──────────────────────────────────────────────────────────────────
HUGGING_FACE_API_KEY=
OPENROUTER_API_KEY=
OPENAI_API_KEY=
DEEPSEEK_API_KEY=
GEMINI_API_KEY=
ELEVENLABS_API_KEY=
# ─── Data & Paths ──────────────────────────────────────────────────────────────
DATA_DIR=${HOME}/homeai-data
REPO_DIR=${HOME}/Projects/HomeAI
@@ -45,3 +53,4 @@ VTUBE_WS_URL=ws://localhost:8001
# ─── P8: Images ────────────────────────────────────────────────────────────────
COMFYUI_URL=http://localhost:8188