# ElevenLabs TTS Models

## Overview

EVE uses **ElevenLabs Turbo v2.5** by default for text-to-speech. This model is specifically optimized for real-time conversational AI.

## ⚠️ Important: V3 Alpha Not Recommended for EVE

According to [ElevenLabs documentation](https://elevenlabs.io/docs/models#eleven-v3-alpha):

> **"Eleven v3 is not made for real-time applications like Agents Platform."**

While V3 offers the highest quality, it is:

- ❌ **Not optimized for real-time** conversation
- ❌ **Higher latency** - Slower response times
- ❌ **Requires multiple generations** - Need to generate several versions and pick the best
- ✅ **Best for**: Audiobooks, character discussions, pre-recorded content

## Current Default Model

**Default**: `eleven_turbo_v2_5`

This model is optimized for EVE and provides:

- ✅ Fast generation speed
- ✅ High-quality natural voices
- ✅ Low latency for real-time conversation (~100-300ms)
- ✅ Cost-effective
- ✅ Multilingual support
- ✅ **Recommended by ElevenLabs for conversational AI**

## Available Models

ElevenLabs offers several models you can use:

### Turbo Models (Recommended)

**`eleven_turbo_v2_5`** (Current Default)

- Latest turbo model
- Excellent quality with fast generation
- Best for conversational AI
- Low latency

**`eleven_turbo_v2`**

- Previous turbo version
- Still high quality
- Slightly older technology

### Multilingual Models

**`eleven_multilingual_v2`**

- Supports 29+ languages
- High quality across languages
- Slower than turbo but more versatile

**`eleven_multilingual_v1`**

- Original multilingual model
- Stable and reliable
- Good for non-English content

### Monolingual Models

**`eleven_monolingual_v1`**

- English only
- High quality
- Original ElevenLabs model
- More expensive than turbo

### Flash Models

**`eleven_flash_v2_5`**

- Ultra-fast generation
- Lowest latency
- Good quality
- Best for real-time applications

**`eleven_flash_v2`**

- Previous flash version
- Very fast
- Lower cost

## Changing the Model

The model is
configurable in the settings store:

```typescript
// In settingsStore.ts
ttsModel: 'eleven_turbo_v2_5' // Default
```

To change:

```typescript
setTtsModel('eleven_flash_v2_5') // For lower latency
setTtsModel('eleven_multilingual_v2') // For better multilingual support
```

## Model Characteristics

### Speed Comparison

1. **Flash** - Fastest (< 300ms)
2. **Turbo** - Very Fast (< 500ms)
3. **Multilingual** - Fast (< 1s)
4. **Monolingual** - Standard (1-2s)

### Quality Comparison

1. **Monolingual** - Highest quality
2. **Turbo v2.5** - Excellent quality
3. **Multilingual v2** - Great quality
4. **Flash** - Good quality

### Cost Comparison

1. **Flash** - Most economical
2. **Turbo** - Cost-effective
3. **Multilingual** - Standard pricing
4. **Monolingual** - Premium pricing

## Recommended Use Cases

### Real-Time Conversation (Default)

```
Model: eleven_turbo_v2_5
Speed: 1.0x
Stability: 50%
Clarity: 75%
```

Best balance for the EVE assistant.

### Ultra-Low Latency

```
Model: eleven_flash_v2_5
Speed: 1.0x
Stability: 60%
Clarity: 80%
```

For instant responses.

### Maximum Quality

```
Model: eleven_monolingual_v1
Speed: 1.0x
Stability: 70%
Clarity: 85%
```

For professional content.

### Multilingual

```
Model: eleven_multilingual_v2
Speed: 1.0x
Stability: 55%
Clarity: 75%
```

For non-English languages.

## Technical Details

### API Call

```typescript
// `client` and `voiceId` are created elsewhere in the TTS Manager.
await client.textToSpeech.convert(voiceId, {
  text: "Hello, how can I help you?",
  model_id: "eleven_turbo_v2_5",
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
    style: 0.0,
    use_speaker_boost: true
  }
})
```

### Model Selection Flow

1. User sends message
2. EVE responds
3. User clicks 🔊 speaker icon
4. TTSControls reads `ttsModel` from settings
5. TTSControls passes the model ID to the TTS Manager
6. TTS Manager calls ElevenLabs with the model ID
7. Audio is generated and played

### Fallback Behavior

If the ElevenLabs model fails or is unavailable, EVE:

- Falls back to the Browser Web Speech API
- Logs a warning in the console
- Continues with free browser TTS

## Future Enhancements

### Planned Features

- **Model selector in UI** - Dropdown to choose model in Settings
- **Auto-detect best model** - Based on language and use case
- **Model presets** - Quick selection for different scenarios
- **Cost tracking** - Show estimated cost per request
- **Quality metrics** - User feedback on voice quality

### Potential Models

As ElevenLabs releases new models, EVE can be updated:

- `eleven_turbo_v3` - Next generation turbo
- `eleven_flash_v3` - Even faster flash model
- `eleven_multilingual_v3` - Improved multilingual
- Specialized models for specific use cases

## Troubleshooting

### Audio Not Playing

- Check that the ElevenLabs API key is valid
- Verify that the model ID is correct
- Check the console for error messages
- Try switching to `eleven_turbo_v2` if v2.5 fails

### Poor Quality

- Try `eleven_monolingual_v1` for better quality
- Adjust stability and clarity settings
- Check voice selection
- Ensure text is well-formatted

### Slow Generation

- Switch to `eleven_flash_v2_5` for speed
- Reduce text length
- Check network connection
- Verify that the API quota has not been exceeded

### Model Not Found Error

```
Error: Model 'eleven_turbo_v3' not found
```

- Model ID may be incorrect
- Model might not be available on your plan
- Fall back to `eleven_turbo_v2_5`
- Check ElevenLabs documentation

## Model Changelog

### v2.5 Models (Current)

- Released: 2024
- Improvements: Better quality, faster generation
- Models: `eleven_turbo_v2_5`, `eleven_flash_v2_5`

### v2 Models

- Released: 2023
- Improvements: Multilingual support, reduced latency
- Models: `eleven_turbo_v2`, `eleven_flash_v2`, `eleven_multilingual_v2`

### v1 Models (Legacy)

- Released: 2022-2023
- Original high-quality models
- Models: `eleven_monolingual_v1`, `eleven_multilingual_v1`

## References

- [ElevenLabs Text-to-Speech API Reference](https://elevenlabs.io/docs/api-reference/text-to-speech)
- [Model Comparison Guide](https://elevenlabs.io/docs/models)
- [Pricing Information](https://elevenlabs.io/pricing)

---

**Current Default**: `eleven_turbo_v2_5`

**Status**: ✅ Configured

**Version**: v0.2.0-rc

**Date**: October 5, 2025
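
## Appendix: Preset and Fallback Sketch

The presets under Recommended Use Cases and the "fall back to `eleven_turbo_v2_5`" advice under Troubleshooting could be captured in code roughly as follows. This is a minimal sketch, not EVE's actual implementation. The names `PRESETS`, `resolveModel`, and `toRequestFields` are hypothetical, and it assumes that the "Clarity" percentage corresponds to the `similarity_boost` field shown in the API call. Only the model IDs and percentages are taken from this document.

```typescript
// Model IDs listed in this document.
const KNOWN_MODELS = [
  'eleven_turbo_v2_5',
  'eleven_turbo_v2',
  'eleven_multilingual_v2',
  'eleven_multilingual_v1',
  'eleven_monolingual_v1',
  'eleven_flash_v2_5',
  'eleven_flash_v2',
] as const;

type TtsModelId = (typeof KNOWN_MODELS)[number];

const DEFAULT_MODEL: TtsModelId = 'eleven_turbo_v2_5';

// Hypothetical use-case keys mirroring the "Recommended Use Cases" headings.
type UseCase = 'realtime' | 'ultraLowLatency' | 'maxQuality' | 'multilingual';

interface TtsPreset {
  model: TtsModelId;
  speed: number;     // playback speed multiplier (1.0 = normal)
  stability: number; // percent, 0-100
  clarity: number;   // percent, 0-100
}

const PRESETS: Record<UseCase, TtsPreset> = {
  realtime:        { model: 'eleven_turbo_v2_5',      speed: 1.0, stability: 50, clarity: 75 },
  ultraLowLatency: { model: 'eleven_flash_v2_5',      speed: 1.0, stability: 60, clarity: 80 },
  maxQuality:      { model: 'eleven_monolingual_v1',  speed: 1.0, stability: 70, clarity: 85 },
  multilingual:    { model: 'eleven_multilingual_v2', speed: 1.0, stability: 55, clarity: 75 },
};

// Returns the requested ID if it is in the known list, otherwise the default.
function resolveModel(requested: string): TtsModelId {
  const known = KNOWN_MODELS as readonly string[];
  return known.indexOf(requested) !== -1 ? (requested as TtsModelId) : DEFAULT_MODEL;
}

// Converts a preset's percentages into 0-1 request fields.
// Assumption: "Clarity" maps to `similarity_boost`.
function toRequestFields(preset: TtsPreset) {
  return {
    model_id: preset.model,
    voice_settings: {
      stability: preset.stability / 100,
      similarity_boost: preset.clarity / 100,
    },
  };
}
```

For example, `toRequestFields(PRESETS.realtime)` yields the same `model_id`, `stability`, and `similarity_boost` values as the API call in Technical Details, and `resolveModel('eleven_turbo_v3')` returns the default model instead of passing an unknown ID to the API.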