eve-alpha/docs/integrations/elevenlabs/ELEVENLABS_MODELS.md
Aodhan Collins 66749a5ce7 Initial commit
2025-10-06 00:33:04 +01:00

ElevenLabs TTS Models

Overview

EVE uses ElevenLabs Turbo v2.5 by default for text-to-speech. This model is specifically optimized for real-time conversational AI.

Eleven v3 is deliberately not the default. According to ElevenLabs documentation:

"Eleven v3 is not made for real-time applications like Agents Platform."

While V3 offers the highest quality, it is a poor fit for EVE:

  • Not optimized for real-time conversation
  • Higher latency, so responses are slower
  • Often needs multiple generations - you generate several versions and pick the best
  • Best suited to audiobooks, character discussions, and pre-recorded content

Current Default Model

Default: eleven_turbo_v2_5

This model suits EVE's real-time use and provides:

  • Fast generation speed
  • High-quality natural voices
  • Low latency for real-time conversation (~100-300ms)
  • Cost-effective
  • Multilingual support
  • Recommended by ElevenLabs for conversational AI

Available Models

ElevenLabs offers several models you can use:

Turbo Models

eleven_turbo_v2_5 (Current Default)

  • Latest turbo model
  • Excellent quality with fast generation
  • Best for conversational AI
  • Low latency

eleven_turbo_v2

  • Previous turbo version
  • Still high quality
  • Slightly older technology

Multilingual Models

eleven_multilingual_v2

  • Supports 29+ languages
  • High quality across languages
  • Slower than turbo but more versatile

eleven_multilingual_v1

  • Original multilingual model
  • Stable and reliable
  • Good for non-English content

Monolingual Models

eleven_monolingual_v1

  • English only
  • High quality
  • Original ElevenLabs model
  • More expensive than turbo

Flash Models

eleven_flash_v2_5

  • Ultra-fast generation
  • Lowest latency
  • Good quality
  • Best for real-time applications

eleven_flash_v2

  • Previous flash version
  • Very fast
  • Lower cost
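The model list above can also be captured as a typed lookup table, so that a misspelled model ID is caught at compile time. This is a minimal sketch: `TTS_MODEL_IDS`, `MODEL_CATALOG`, and `modelsInFamily` are illustrative names, not part of EVE's codebase, and the summaries only restate the bullets above.

```typescript
// Model IDs listed in this document, in the order they appear.
// All names in this block are illustrative, not EVE's actual code.
const TTS_MODEL_IDS = [
  "eleven_turbo_v2_5",
  "eleven_turbo_v2",
  "eleven_multilingual_v2",
  "eleven_multilingual_v1",
  "eleven_monolingual_v1",
  "eleven_flash_v2_5",
  "eleven_flash_v2",
] as const;

type TtsModelId = (typeof TTS_MODEL_IDS)[number];
type ModelFamily = "turbo" | "multilingual" | "monolingual" | "flash";

interface ModelInfo {
  family: ModelFamily;
  summary: string;
}

// Record<TtsModelId, ...> forces an entry for every ID and rejects unknown keys.
const MODEL_CATALOG: Record<TtsModelId, ModelInfo> = {
  eleven_turbo_v2_5: { family: "turbo", summary: "Latest turbo model (current default)" },
  eleven_turbo_v2: { family: "turbo", summary: "Previous turbo version" },
  eleven_multilingual_v2: { family: "multilingual", summary: "29+ languages, slower than turbo" },
  eleven_multilingual_v1: { family: "multilingual", summary: "Original multilingual model" },
  eleven_monolingual_v1: { family: "monolingual", summary: "English only, original model" },
  eleven_flash_v2_5: { family: "flash", summary: "Ultra-fast, lowest latency" },
  eleven_flash_v2: { family: "flash", summary: "Previous flash version" },
};

// Return the IDs that belong to one family, preserving document order.
function modelsInFamily(family: ModelFamily): TtsModelId[] {
  return TTS_MODEL_IDS.filter((id) => MODEL_CATALOG[id].family === family);
}
```

For example, `modelsInFamily("flash")` yields the two flash IDs, newest first.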

Changing the Model

The model is configurable in the settings store:

// In settingsStore.ts
ttsModel: 'eleven_turbo_v2_5' // Default

To change the model, call the setter with another model ID:

setTtsModel('eleven_flash_v2_5') // For lower latency
setTtsModel('eleven_multilingual_v2') // For better multilingual support
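How `ttsModel` and `setTtsModel` might fit together can be sketched without any state library. The real `settingsStore.ts` is not shown in this document, so `createSettingsStore`, `getTtsModel`, and the validation step are assumptions made for illustration; only the `ttsModel` field and the `setTtsModel` setter come from the text above.

```typescript
// Hypothetical stand-in for settingsStore.ts. Only the `ttsModel` field and the
// `setTtsModel` setter are named in this document; everything else is assumed.
const KNOWN_MODELS = [
  "eleven_turbo_v2_5",
  "eleven_turbo_v2",
  "eleven_multilingual_v2",
  "eleven_multilingual_v1",
  "eleven_monolingual_v1",
  "eleven_flash_v2_5",
  "eleven_flash_v2",
] as const;

type TtsModel = (typeof KNOWN_MODELS)[number];

// Type guard so that arbitrary strings (e.g. from saved settings) get checked.
function isKnownModel(value: string): value is TtsModel {
  return KNOWN_MODELS.some((id) => id === value);
}

interface SettingsStore {
  getTtsModel(): TtsModel;
  setTtsModel(model: string): void;
}

function createSettingsStore(): SettingsStore {
  let ttsModel: TtsModel = "eleven_turbo_v2_5"; // Default
  return {
    getTtsModel: () => ttsModel,
    setTtsModel(model: string): void {
      // Reject unknown IDs early instead of failing later at the API call.
      if (!isKnownModel(model)) {
        throw new Error(`Unknown TTS model: ${model}`);
      }
      ttsModel = model;
    },
  };
}
```

Validating in the setter is a design choice of this sketch: it surfaces a bad ID when the setting is changed rather than when audio is requested.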

Model Characteristics

Speed Comparison

  1. Flash - Fastest (< 300ms)
  2. Turbo - Very Fast (< 500ms)
  3. Multilingual - Fast (< 1s)
  4. Monolingual - Standard (1-2s)

Quality Comparison

  1. Monolingual - Highest quality
  2. Turbo v2.5 - Excellent quality
  3. Multilingual v2 - Great quality
  4. Flash - Good quality

Cost Comparison

  1. Flash - Most economical
  2. Turbo - Cost-effective
  3. Multilingual - Standard pricing
  4. Monolingual - Premium pricing
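The three rankings above can be combined into a single ordering by weighting each criterion. The sketch below is one possible way to do that, not something EVE implements: `RANKINGS` copies the lists above, while `rankFamilies` and its weighting scheme are assumptions for illustration.

```typescript
type Family = "flash" | "turbo" | "multilingual" | "monolingual";
type Criterion = "speed" | "quality" | "cost";

// Rankings copied from the comparison lists above, best first.
const RANKINGS: Record<Criterion, Family[]> = {
  speed: ["flash", "turbo", "multilingual", "monolingual"],
  quality: ["monolingual", "turbo", "multilingual", "flash"],
  cost: ["flash", "turbo", "multilingual", "monolingual"],
};

// Score = weighted sum of rank positions (0 = best), so a lower score is better.
function rankFamilies(weights: Record<Criterion, number>): Family[] {
  const criteria: Criterion[] = ["speed", "quality", "cost"];
  const score = (family: Family): number =>
    criteria.reduce((sum, c) => sum + weights[c] * RANKINGS[c].indexOf(family), 0);
  // Copy before sorting so the shared ranking table is not mutated.
  // Ties keep the speed order, since Array.prototype.sort is stable (ES2019+).
  return RANKINGS.speed.slice().sort((a, b) => score(a) - score(b));
}
```

With speed and quality weighted equally and cost weighted half as much (`{ speed: 2, quality: 2, cost: 1 }`), turbo ranks first, which is consistent with it being the default.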

Recommended Settings

Real-Time Conversation (Default)

Model: eleven_turbo_v2_5
Speed: 1.0x
Stability: 50%
Clarity: 75%

Best balance of speed and quality for the EVE assistant

Ultra-Low Latency

Model: eleven_flash_v2_5
Speed: 1.0x
Stability: 60%
Clarity: 80%

For the fastest possible responses

Maximum Quality

Model: eleven_monolingual_v1
Speed: 1.0x
Stability: 70%
Clarity: 85%

For professional content

Multilingual

Model: eleven_multilingual_v2
Speed: 1.0x
Stability: 55%
Clarity: 75%

For non-English languages
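These presets are given as percentages, while the API call takes fractions between 0 and 1. The sketch below stores the four presets and converts them. It assumes that "Clarity" corresponds to `similarity_boost`, which is inferred from the default preset (75%) matching the 0.75 used in the API call in this document. `PRESETS` and `toVoiceSettings` are hypothetical names, and how EVE applies the Speed value is not documented here, so it is carried along but not converted.

```typescript
interface Preset {
  model: string;
  speed: number;        // Playback speed multiplier, e.g. 1.0
  stabilityPct: number; // "Stability" as shown above, in percent
  clarityPct: number;   // "Clarity" as shown above, in percent
}

type PresetName = "realtime" | "ultraLowLatency" | "maxQuality" | "multilingual";

// Values copied from the four presets above; the key names are illustrative.
const PRESETS: Record<PresetName, Preset> = {
  realtime: { model: "eleven_turbo_v2_5", speed: 1.0, stabilityPct: 50, clarityPct: 75 },
  ultraLowLatency: { model: "eleven_flash_v2_5", speed: 1.0, stabilityPct: 60, clarityPct: 80 },
  maxQuality: { model: "eleven_monolingual_v1", speed: 1.0, stabilityPct: 70, clarityPct: 85 },
  multilingual: { model: "eleven_multilingual_v2", speed: 1.0, stabilityPct: 55, clarityPct: 75 },
};

// Convert percentages to the 0..1 fractions used in voice_settings.
// Assumption: "Clarity" maps to similarity_boost.
function toVoiceSettings(preset: Preset): { stability: number; similarity_boost: number } {
  return {
    stability: preset.stabilityPct / 100,
    similarity_boost: preset.clarityPct / 100,
  };
}
```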

Technical Details

API Call

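// Assumes `client` is an initialized ElevenLabs SDK client and `voiceId` is a valid voice ID
const audio = await client.textToSpeech.convert(voiceId, {
  text: "Hello, how can I help you?",
  model_id: "eleven_turbo_v2_5", // Value of ttsModel from settings
  voice_settings: {
    stability: 0.5,         // Stability: 50%
    similarity_boost: 0.75, // Clarity: 75%
    style: 0.0,
    use_speaker_boost: true
  }
});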
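For readers not using the SDK, the same request can be sent to the ElevenLabs REST endpoint directly. The sketch below only builds the request so it can be inspected without a network call. `buildTtsRequest` and `TtsRequest` are hypothetical names, and the API key and voice ID in any usage are placeholders.

```typescript
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
  use_speaker_boost: boolean;
}

interface TtsRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a text-to-speech request for the REST API.
function buildTtsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string,
  voiceSettings: VoiceSettings
): TtsRequest {
  return {
    // The voice ID is part of the path, so escape it defensively.
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    init: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text, model_id: modelId, voice_settings: voiceSettings }),
    },
  };
}
```

The result can be passed to `fetch(request.url, request.init)`; the response body is the generated audio.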

Model Selection Flow

  1. User sends message
  2. EVE responds
  3. User clicks 🔊 speaker icon
  4. TTSControls reads ttsModel from settings
  5. TTSControls passes the model ID to TTS Manager
  6. TTS Manager calls ElevenLabs with model ID
  7. Audio is generated and played

Fallback Behavior

If the ElevenLabs model fails or is unavailable, EVE:

  • Falls back to the browser Web Speech API
  • Logs a warning in the console
  • Continues with free browser TTS
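The fallback behavior can be sketched as a small wrapper that tries one speech function and, on failure, warns and uses another. This is an illustration under assumptions rather than EVE's TTS Manager: `speakWithFallback`, `Speak`, and `Engine` are hypothetical names, and both engines are injected so the logic can run outside a browser.

```typescript
// A speech function resolves when playback finishes and rejects on failure.
type Speak = (text: string) => Promise<void>;
type Engine = "elevenlabs" | "browser";

// Try the primary engine first; on any error, log a warning and use the fallback.
// Returns which engine actually spoke the text.
async function speakWithFallback(
  text: string,
  primary: Speak,
  fallback: Speak,
  warn: (message: string) => void = console.warn
): Promise<Engine> {
  try {
    await primary(text);
    return "elevenlabs";
  } catch (err) {
    warn(`ElevenLabs TTS failed, falling back to browser TTS: ${String(err)}`);
    await fallback(text);
    return "browser";
  }
}
```

In the browser, the `fallback` argument would typically wrap `window.speechSynthesis.speak(new SpeechSynthesisUtterance(text))`, and `primary` would wrap the ElevenLabs call.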

Future Enhancements

Planned Features

  • Model selector in UI - Dropdown to choose model in Settings
  • Auto-detect best model - Based on language and use case
  • Model presets - Quick selection for different scenarios
  • Cost tracking - Show estimated cost per request
  • Quality metrics - User feedback on voice quality

Potential Models

As ElevenLabs releases new models, EVE can be updated to support them. The IDs below are speculative placeholders, not models available today:

  • eleven_turbo_v3 - Next generation turbo
  • eleven_flash_v3 - Even faster flash model
  • eleven_multilingual_v3 - Improved multilingual
  • Specialized models for specific use cases

Troubleshooting

Audio Not Playing

  • Check that the ElevenLabs API key is valid
  • Verify that the model ID is correct
  • Check the console for error messages
  • Try switching to eleven_turbo_v2 if v2.5 fails

Poor Quality

  • Try eleven_monolingual_v1 for better quality
  • Adjust stability and clarity settings
  • Check voice selection
  • Ensure text is well-formatted

Slow Generation

  • Switch to eleven_flash_v2_5 for speed
  • Reduce text length
  • Check network connection
  • Verify that the API quota has not been exceeded
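One way to act on "Reduce text length" is to split a long reply at sentence boundaries and send shorter pieces. The helper below is a hypothetical sketch, not part of EVE: `splitIntoChunks` and its `maxChars` parameter are assumptions, and a sentence longer than the limit is kept whole rather than cut mid-sentence.

```typescript
// Split text into chunks of at most `maxChars` characters, breaking only
// after sentence-ending punctuation that is followed by whitespace or the end.
function splitIntoChunks(text: string, maxChars: number): string[] {
  // Each match starts at a non-space character and runs lazily to the next
  // sentence end; a "." inside "3.5" does not count because no space follows it.
  const sentences = text.match(/\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g) || [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current === "" ? sentence : current + " " + sentence;
    if (current === "" || candidate.length <= maxChars) {
      // Either the chunk is empty (accept even an oversized sentence) or it still fits.
      current = candidate;
    } else {
      chunks.push(current);
      current = sentence;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each chunk can then be sent as its own request, so playback of the first sentence can begin before the rest is generated.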

Model Not Found Error

Error: Model 'eleven_turbo_v3' not found

  • The model ID may be incorrect
  • The model might not be available on your plan
  • Fall back to eleven_turbo_v2_5
  • Check the ElevenLabs documentation for current model IDs
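The fallback advice from this troubleshooting section can be expressed as a small resolver that walks a preference chain. This is a sketch under assumptions: `resolveModel` and `FALLBACK_CHAIN` are hypothetical names, and the `available` list stands in for whatever set of model IDs your plan supports; how that list is obtained is left open.

```typescript
// Preferred substitutes when a requested model is rejected, per the tips above:
// the default first, then the previous turbo version.
const FALLBACK_CHAIN: readonly string[] = ["eleven_turbo_v2_5", "eleven_turbo_v2"];

// Return the requested model if it is available, otherwise the first available
// entry in the fallback chain, or undefined if none of them can be used.
function resolveModel(requested: string, available: readonly string[]): string | undefined {
  const candidates = [requested].concat(FALLBACK_CHAIN);
  for (const id of candidates) {
    if (available.indexOf(id) !== -1) {
      return id;
    }
  }
  return undefined;
}
```

When `resolveModel` returns `undefined`, no ElevenLabs model is usable and the browser TTS fallback applies.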

Model Changelog

v2.5 Models (Current)

  • Released: 2024
  • Improvements: Better quality, faster generation
  • Models: eleven_turbo_v2_5, eleven_flash_v2_5

v2 Models

  • Released: 2023 (eleven_flash_v2 followed in 2024)
  • Improvements: Multilingual support, reduced latency
  • Models: eleven_turbo_v2, eleven_flash_v2, eleven_multilingual_v2

v1 Models (Legacy)

  • Released: 2022-2023
  • Original high-quality models
  • Models: eleven_monolingual_v1, eleven_multilingual_v1


Current Default: eleven_turbo_v2_5
Status: Configured
Version: v0.2.0-rc
Date: October 5, 2025