TTS Quality Controls
Overview
EVE now includes voice quality controls that let you adjust the speed, stability, and clarity of text-to-speech output.
Controls Available
1. Speed Control (All Voices)
Range: 0.25x - 4.0x
Default: 1.0x (Normal)
Applies to: Both Browser TTS and ElevenLabs
- 0.25x - 0.75x: Slower speech, good for learning or understanding complex content
- 1.0x: Natural speaking pace
- 1.25x - 2.0x: Faster speech, efficient for experienced listeners
- 2.0x - 4.0x: Very fast, for quickly scanning content
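As a rough illustration of the range above, the sketch below clamps an arbitrary value into 0.25x - 4.0x and snaps it to a 0.25 step. The helper name `normalizeSpeed` and the constants are hypothetical and are not taken from EVE's code.

```typescript
// Range and default from the Speed Control description; the 0.25 step
// is an assumption matching a slider with 0.25 increments.
const MIN_SPEED = 0.25;
const MAX_SPEED = 4.0;
const SPEED_STEP = 0.25;
const DEFAULT_SPEED = 1.0;

// Hypothetical helper: clamps a raw value into the supported range and
// snaps it to the nearest step.
function normalizeSpeed(raw: number): number {
  if (!isFinite(raw)) {
    return DEFAULT_SPEED; // NaN or Infinity: fall back to the default
  }
  const clamped = Math.min(MAX_SPEED, Math.max(MIN_SPEED, raw));
  return Math.round(clamped / SPEED_STEP) * SPEED_STEP;
}
```

With this sketch, `normalizeSpeed(1.6)` yields 1.5 and `normalizeSpeed(10)` yields 4.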
2. Stability Control (ElevenLabs Only)
Range: 0% - 100%
Default: 50%
Applies to: ElevenLabs voices only
What it does:
- Controls consistency vs expressiveness of the voice
- Higher values = more consistent, predictable delivery
- Lower values = more varied, emotional, expressive
When to adjust:
- High (70-100%): Audiobooks, technical content, professional narration
- Medium (40-60%): General conversation, balanced approach
- Low (0-30%): Character voices, dramatic readings, creative content
3. Clarity Control (ElevenLabs Only)
Range: 0% - 100%
Default: 75%
Applies to: ElevenLabs voices only
What it does:
- Controls similarity boost / voice clarity enhancement
- Higher values = closer to original voice, enhanced clarity
- Lower values = more variation, creative interpretation
When to adjust:
- High (70-100%): Maximum clarity, important information, professional use
- Medium (50-70%): Natural balance
- Low (0-40%): More creative interpretation, character variation
User Interface
Location
Settings > Voice Settings > Voice Quality Settings
Design
- Speed: Full-width slider with 0.25 step increments
  - Shows current value in label (e.g., "Speed: 1.50x")
  - Visual markers at 0.25x, 1.0x, and 4.0x
- Stability: Full-width slider with 5% step increments
  - Shows percentage in label (e.g., "Stability: 50%")
  - Disabled (grayed out) when using browser voices
  - Helpful description below slider
- Clarity: Full-width slider with 5% step increments
  - Shows percentage in label (e.g., "Clarity: 75%")
  - Disabled (grayed out) when using browser voices
  - Helpful description below slider
Smart UI Features
- ElevenLabs-only controls show "(ElevenLabs only)" in label
- Controls are disabled when browser voice is selected
- Real-time value display as you drag sliders
- Settings persist across sessions
- All controls visible even when disabled for easy reference
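The label and disabled-state behavior described above could be sketched as follows. All function names are hypothetical. The prefix check assumes ElevenLabs voice IDs start with "elevenlabs:", as the Troubleshooting section indicates, and the percentage label assumes the underlying value is a fraction between 0 and 1.

```typescript
// Assumption: ElevenLabs voice IDs carry an "elevenlabs:" prefix.
function isElevenLabsVoice(voiceId: string): boolean {
  return voiceId.indexOf("elevenlabs:") === 0;
}

// Stability and Clarity sliders are disabled for browser voices.
function areElevenLabsControlsDisabled(voiceId: string): boolean {
  return !isElevenLabsVoice(voiceId);
}

// Formats the speed label with two decimals, e.g. "Speed: 1.50x".
function formatSpeedLabel(speed: number): string {
  return `Speed: ${speed.toFixed(2)}x`;
}

// Formats a 0.0-1.0 value as a percentage label, e.g. "Stability: 50%".
function formatPercentLabel(name: string, fraction: number): string {
  return `${name}: ${Math.round(fraction * 100)}%`;
}
```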
Technical Implementation
Settings Store
ttsSpeed: number // 0.25 to 4.0
ttsStability: number // 0.0 to 1.0
ttsSimilarityBoost: number // 0.0 to 1.0
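Because the sliders show percentages while the store fields hold fractions, some conversion has to happen between the two. The sketch below shows one way to do it, together with the documented defaults. The interface name, the constant, and both helper functions are illustrative assumptions rather than EVE's actual store code.

```typescript
// Shape of the three fields listed above (interface name is hypothetical).
interface TtsQualitySettings {
  ttsSpeed: number;           // 0.25 to 4.0
  ttsStability: number;       // 0.0 to 1.0
  ttsSimilarityBoost: number; // 0.0 to 1.0
}

// Documented defaults: 1.0x speed, 50% stability, 75% clarity.
const DEFAULT_TTS_SETTINGS: TtsQualitySettings = {
  ttsSpeed: 1.0,
  ttsStability: 0.5,
  ttsSimilarityBoost: 0.75,
};

// Slider percentage (0-100) to stored fraction (0.0-1.0).
function percentToFraction(percent: number): number {
  return Math.min(100, Math.max(0, percent)) / 100;
}

// Stored fraction (0.0-1.0) to whole slider percentage (0-100).
function fractionToPercent(fraction: number): number {
  return Math.round(Math.min(1, Math.max(0, fraction)) * 100);
}
```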
Usage in TTS
await ttsManager.speak(text, {
  voiceId: selectedVoice,
  volume: 1.0,
  rate: ttsSpeed,                      // Browser TTS rate
  stability: ttsStability,             // ElevenLabs stability
  similarityBoost: ttsSimilarityBoost  // ElevenLabs clarity
})
Provider-Specific Application
Browser TTS:
- Uses the rate parameter from the speed control
- Ignores stability and similarity boost (not applicable)
ElevenLabs TTS:
- Applies all three parameters
- Speed can be adjusted in post-processing if needed
- Stability and similarity boost sent directly to API
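The split described above might look roughly like the following. This is a sketch under assumptions: `paramsForProvider` and the two result types are hypothetical, and the provider is inferred from an "elevenlabs:" voice ID prefix. Only the `stability` and `similarity_boost` field names follow ElevenLabs' voice settings.

```typescript
// Options mirroring the ttsManager.speak call.
interface SpeakOptions {
  voiceId: string;
  volume: number;
  rate: number;
  stability: number;
  similarityBoost: number;
}

// Browser TTS only needs rate and volume.
interface BrowserParams {
  provider: "browser";
  rate: number;
  volume: number;
}

// ElevenLabs receives the voice settings; speed is kept separately
// because it may be applied by the app in post-processing.
interface ElevenLabsParams {
  provider: "elevenlabs";
  speed: number;
  voiceSettings: { stability: number; similarity_boost: number };
}

// Hypothetical helper: picks the parameters relevant to each provider.
function paramsForProvider(options: SpeakOptions): BrowserParams | ElevenLabsParams {
  if (options.voiceId.indexOf("elevenlabs:") === 0) {
    return {
      provider: "elevenlabs",
      speed: options.rate,
      voiceSettings: {
        stability: options.stability,
        similarity_boost: options.similarityBoost,
      },
    };
  }
  // Stability and similarity boost are dropped for browser voices.
  return { provider: "browser", rate: options.rate, volume: options.volume };
}
```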
Examples
For Audiobooks
Speed: 1.0x - 1.25x (comfortable listening)
Stability: 80% (consistent narration)
Clarity: 85% (clear pronunciation)
For Casual Chat
Speed: 1.0x (natural pace)
Stability: 50% (balanced)
Clarity: 75% (good clarity)
For Quick Scanning
Speed: 2.0x - 3.0x (fast playback)
Stability: 60% (maintain clarity at speed)
Clarity: 90% (maximum clarity for comprehension)
For Character Voices
Speed: 0.75x - 1.0x (theatrical pacing)
Stability: 20% (high expressiveness)
Clarity: 50% (allow variation)
Benefits
✅ Personalization - Adjust voice to your preferences
✅ Accessibility - Slower speeds for comprehension
✅ Efficiency - Faster speeds for quick consumption
✅ Quality Control - Fine-tune ElevenLabs voice output
✅ Flexibility - Different settings for different use cases
✅ Universal - Speed works on all voices, premium controls for ElevenLabs
Persistence
All settings:
- ✅ Are saved to localStorage
- ✅ Persist across app restarts
- ✅ Are applied automatically to all future TTS playback
- ✅ Can be changed at any time
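A minimal sketch of this kind of persistence is shown below. It is written against a small storage interface so it also runs outside a browser; the browser's `localStorage` object satisfies the same two methods. The storage key, type names, and function names are all hypothetical and do not reflect EVE's actual store.

```typescript
// Minimal storage contract; window.localStorage matches it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface StoredTtsSettings {
  ttsSpeed: number;
  ttsStability: number;
  ttsSimilarityBoost: number;
}

const STORAGE_KEY = "eve-tts-settings"; // hypothetical key name
const FALLBACK_SETTINGS: StoredTtsSettings = {
  ttsSpeed: 1.0,
  ttsStability: 0.5,
  ttsSimilarityBoost: 0.75,
};

function saveTtsSettings(store: KeyValueStore, settings: StoredTtsSettings): void {
  store.setItem(STORAGE_KEY, JSON.stringify(settings));
}

// Returns saved values, filling any missing field with its default.
function loadTtsSettings(store: KeyValueStore): StoredTtsSettings {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) {
    return { ...FALLBACK_SETTINGS };
  }
  try {
    const parsed = JSON.parse(raw) as Partial<StoredTtsSettings>;
    return { ...FALLBACK_SETTINGS, ...parsed };
  } catch (err) {
    return { ...FALLBACK_SETTINGS }; // unreadable entry: use defaults
  }
}

// In-memory stand-in for localStorage, useful in tests.
function createMemoryStore(): KeyValueStore {
  const data: Record<string, string> = {};
  return {
    getItem: (key) =>
      Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null,
    setItem: (key, value) => {
      data[key] = value;
    },
  };
}
```

In a browser, the same functions could be called as `saveTtsSettings(localStorage, settings)`.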
Future Enhancements
Potential Additions
- Pitch control for browser TTS
- Volume control per-voice
- Per-voice presets (save favorite settings for each voice)
- Quick presets (Audiobook, Podcast, Speed Reader, etc.)
- Real-time adjustment while audio is playing
- A/B comparison to test settings side-by-side
Advanced Features
- Voice EQ for fine-tuning frequency response
- Emotion control for ElevenLabs (happy, sad, excited, etc.)
- Speaking style selection (narration, conversation, etc.)
- Prosody controls (emphasis, pauses, intonation)
Troubleshooting
Sliders Not Responsive
- Check that TTS is enabled
- Verify a voice is selected
- Try refreshing the settings panel
ElevenLabs Controls Disabled
- Make sure an ElevenLabs voice is selected (starts with "elevenlabs:")
- Browser voices won't enable these controls (by design)
- Check that ElevenLabs API key is configured
Settings Not Saving
- Check browser localStorage permissions
- Try clearing cache and reloading
- Verify settings store is persisting
Speed Not Applying
- Browser TTS: Rate should change immediately
- ElevenLabs: Speed adjustment may vary by voice
- Try values between 0.5x and 2.0x for best results
Testing
To Test Speed Control
- Enable TTS
- Adjust speed slider
- Click speaker icon on a message
- Voice should speak at selected speed
To Test ElevenLabs Controls
- Select an ElevenLabs voice
- Adjust stability slider
- Adjust clarity slider
- Click speaker icon
- Notice difference in voice quality
To Test Persistence
- Adjust all sliders
- Close settings
- Restart app
- Open settings
- Values should be preserved
Recommended Settings
Default (Balanced):
- Speed: 1.0x
- Stability: 50%
- Clarity: 75%
Professional:
- Speed: 1.0x
- Stability: 80%
- Clarity: 85%
Expressive:
- Speed: 1.0x
- Stability: 30%
- Clarity: 60%
Fast Listener:
- Speed: 1.75x
- Stability: 65%
- Clarity: 90%
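For reference, the four recommended combinations can be written as data, with percentages expressed as fractions between 0 and 1. This is only an illustrative sketch: the preset keys, the `VoicePreset` type, and `getPreset` are hypothetical, and nothing here implies that EVE ships a preset selector.

```typescript
interface VoicePreset {
  speed: number;           // playback multiplier
  stability: number;       // 0.0 to 1.0
  similarityBoost: number; // 0.0 to 1.0 (shown as "Clarity" in the UI)
}

// Values copied from the Recommended Settings list.
const RECOMMENDED_PRESETS = {
  balanced: { speed: 1.0, stability: 0.5, similarityBoost: 0.75 },
  professional: { speed: 1.0, stability: 0.8, similarityBoost: 0.85 },
  expressive: { speed: 1.0, stability: 0.3, similarityBoost: 0.6 },
  fastListener: { speed: 1.75, stability: 0.65, similarityBoost: 0.9 },
};

type PresetName = keyof typeof RECOMMENDED_PRESETS;

// Returns a copy so callers cannot mutate the shared table.
function getPreset(name: PresetName): VoicePreset {
  return { ...RECOMMENDED_PRESETS[name] };
}
```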
Status: ✅ Complete
Version: v0.2.0-rc
Date: October 5, 2025