feat: upgrade voice pipeline — MLX Whisper STT (20x faster), Qwen3.5 MoE LLM, fix HA tool calling
- Replace faster-whisper with wyoming-mlx-whisper (whisper-large-v3-turbo, MLX Metal GPU); STT latency for short voice commands: 8.4s → 400ms
- Add Qwen3.5-35B-A3B (MoE, 3B active params, Q8_0) to Ollama — 26.7 tok/s vs 5.4 tok/s (70B)
- Add model preload launchd service to pin the voice model in VRAM permanently
- Fix HA tool calling: set commands.native=true, symlink ha-ctl into PATH
- Add pipeline benchmark script (STT/LLM/TTS latency profiling)
- Add service restart buttons and STT endpoint to the dashboard
- Bind the Vite dev server to 0.0.0.0 for LAN access

Total estimated pipeline latency: ~27s → ~4s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
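The 26.7 tok/s figure above is the kind of number Ollama's `/api/generate` endpoint (with `stream` set to false) lets you derive from its `eval_count` (tokens generated) and `eval_duration` (nanoseconds spent generating) fields. A minimal sketch of that calculation, as the benchmark script might perform it (the helper name is illustrative, not taken from this repo):

```python
def tokens_per_second(resp: dict) -> float:
    """Derive generation throughput from an Ollama /api/generate response.

    eval_count    -- number of tokens generated
    eval_duration -- time spent generating, in nanoseconds
    """
    return resp["eval_count"] / resp["eval_duration"] * 1e9

# Example: 267 tokens generated in 10 seconds of eval time -> 26.7 tok/s
print(tokens_per_second({"eval_count": 267, "eval_duration": 10_000_000_000}))
```

Averaging this over several prompts smooths out per-request variance such as prompt-processing time, which Ollama reports separately (`prompt_eval_duration`).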
@@ -8,21 +8,11 @@
 	<key>ProgramArguments</key>
 	<array>
-		<string>/Users/aodhan/homeai-voice-env/bin/wyoming-faster-whisper</string>
+		<string>/Users/aodhan/homeai-whisper-mlx-env/bin/wyoming-mlx-whisper</string>
 		<string>--uri</string>
 		<string>tcp://0.0.0.0:10300</string>
-		<string>--model</string>
-		<string>large-v3</string>
-		<string>--language</string>
-		<string>en</string>
-		<string>--device</string>
-		<string>cpu</string>
-		<string>--compute-type</string>
-		<string>int8</string>
-		<string>--data-dir</string>
-		<string>/Users/aodhan/models/whisper</string>
-		<string>--download-dir</string>
-		<string>/Users/aodhan/models/whisper</string>
 	</array>

 	<key>RunAtLoad</key>
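The "model preload launchd service" mentioned in the commit message is not shown in this hunk. One way such a job could be structured (label, model tag, and port are illustrative assumptions, not taken from this commit; an Ollama `/api/generate` call with no prompt and `keep_alive: -1` loads the model and keeps it resident indefinitely):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<!-- Hypothetical label; the service added in this commit may differ. -->
	<key>Label</key>
	<string>com.homeai.model-preload</string>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/bin/curl</string>
		<string>-s</string>
		<string>http://127.0.0.1:11434/api/generate</string>
		<string>-d</string>
		<!-- Model tag is a placeholder; keep_alive: -1 pins the model in memory. -->
		<string>{"model": "qwen3.5:35b-a3b", "keep_alive": -1}</string>
	</array>
	<key>RunAtLoad</key>
	<true/>
</dict>
</plist>
```

Running this at load (and on any Ollama restart) would achieve the "pin voice model in VRAM permanently" behavior the commit message describes.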