# TTS Quality Controls

## Overview

EVE now includes voice quality controls that let you customize the speed, stability, and clarity of text-to-speech output.

## Controls Available

### 1. Speed Control (All Voices)

**Range**: 0.25x - 4.0x  
**Default**: 1.0x (Normal)  
**Applies to**: Both Browser TTS and ElevenLabs

- **0.25x - 0.75x**: Slower speech, good for learning or for following complex content
- **1.0x**: Natural speaking pace
- **1.25x - 2.0x**: Faster speech, efficient for experienced listeners
- **2.0x - 4.0x**: Very fast, for quickly scanning content

### 2. Stability Control (ElevenLabs Only)

**Range**: 0% - 100%  
**Default**: 50%  
**Applies to**: ElevenLabs voices only

**What it does**:
- Controls the trade-off between consistency and expressiveness of the voice
- Higher values = more consistent, predictable delivery
- Lower values = more varied, emotional, expressive delivery

**When to adjust**:
- **High (70-100%)**: Audiobooks, technical content, professional narration
- **Medium (40-60%)**: General conversation, balanced approach
- **Low (0-30%)**: Character voices, dramatic readings, creative content

### 3. Clarity Control (ElevenLabs Only)

**Range**: 0% - 100%  
**Default**: 75%  
**Applies to**: ElevenLabs voices only

**What it does**:
- Controls the similarity boost (voice clarity enhancement)
- Higher values = closer to the original voice, enhanced clarity
- Lower values = more variation, creative interpretation

**When to adjust**:
- **High (70-100%)**: Maximum clarity, important information, professional use
- **Medium (50-70%)**: Natural balance
- **Low (0-40%)**: More creative interpretation, character variation

## User Interface

### Location

Settings > Voice Settings > Voice Quality Settings

### Design

- **Speed**: Full-width slider with 0.25 step increments
  - Shows the current value in the label (e.g., "Speed: 1.50x")
  - Visual markers at 0.25x, 1.0x, and 4.0x
- **Stability**: Full-width slider with 5% step increments
  - Shows the percentage in the label (e.g., "Stability: 50%")
  - Disabled (grayed out) when using browser voices
  - Short description shown below the slider
- **Clarity**: Full-width slider with 5% step increments
  - Shows the percentage in the label (e.g., "Clarity: 75%")
  - Disabled (grayed out) when using browser voices
  - Short description shown below the slider

### Smart UI Features

- ElevenLabs-only controls show "(ElevenLabs only)" in their labels
- ElevenLabs-only controls are disabled when a browser voice is selected
- Values update in real time as you drag the sliders
- Settings persist across sessions
- All controls stay visible even when disabled, for easy reference

## Technical Implementation

### Settings Store

```typescript
ttsSpeed: number            // 0.25 to 4.0 (playback speed multiplier)
ttsStability: number        // 0.0 to 1.0 (shown in the UI as 0% - 100%)
ttsSimilarityBoost: number  // 0.0 to 1.0 (shown in the UI as 0% - 100%, labeled "Clarity")
```

### Usage in TTS

```typescript
await ttsManager.speak(text, {
  voiceId: selectedVoice,
  volume: 1.0,
  rate: ttsSpeed,                      // Speed control (browser TTS rate)
  stability: ttsStability,             // ElevenLabs stability
  similarityBoost: ttsSimilarityBoost  // ElevenLabs clarity
})
```

### Provider-Specific Application

**Browser TTS**:
- Uses the `rate` parameter from the speed control
- Ignores stability and similarity boost (not applicable)

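
The browser-voice behavior above, where only `rate` is used and the ElevenLabs-only values are ignored, could be sketched roughly as follows. This is a minimal illustration, not EVE's actual code: the `VoiceQualitySettings` and `SpeakOptions` types and the `clamp`, `isElevenLabsVoice`, and `buildSpeakOptions` helpers are hypothetical names introduced here. It assumes ElevenLabs voice IDs start with `elevenlabs:` and falls back to the documented defaults when a stored value is not a number.

```typescript
// Hypothetical shape mirroring the three settings-store fields.
interface VoiceQualitySettings {
  ttsSpeed: number;           // 0.25 to 4.0
  ttsStability: number;       // 0.0 to 1.0
  ttsSimilarityBoost: number; // 0.0 to 1.0
}

// Hypothetical options shape; stability and similarityBoost are optional
// because browser voices do not use them.
interface SpeakOptions {
  voiceId: string;
  volume: number;
  rate: number;
  stability?: number;
  similarityBoost?: number;
}

// Clamp a value into [min, max]; NaN falls back to the given default.
function clamp(value: number, min: number, max: number, fallback: number): number {
  if (isNaN(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Assumption: ElevenLabs voice IDs carry an "elevenlabs:" prefix.
function isElevenLabsVoice(voiceId: string): boolean {
  return voiceId.indexOf("elevenlabs:") === 0;
}

// Build speak options: speed applies to every voice, while stability and
// similarity boost are attached only for ElevenLabs voices.
function buildSpeakOptions(voiceId: string, s: VoiceQualitySettings): SpeakOptions {
  const options: SpeakOptions = {
    voiceId,
    volume: 1.0,
    rate: clamp(s.ttsSpeed, 0.25, 4.0, 1.0),
  };
  if (isElevenLabsVoice(voiceId)) {
    options.stability = clamp(s.ttsStability, 0, 1, 0.5);
    options.similarityBoost = clamp(s.ttsSimilarityBoost, 0, 1, 0.75);
  }
  return options;
}
```

Under these assumptions, the result of `buildSpeakOptions(selectedVoice, settings)` could be passed as the second argument to `ttsManager.speak`, so out-of-range stored values never reach either provider.
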
**ElevenLabs TTS**:
- Applies all three parameters
- Speed can be adjusted in post-processing if needed
- Stability and similarity boost are sent directly to the API

## Examples

### For Audiobooks

```
Speed: 1.0x - 1.25x (comfortable listening)
Stability: 80% (consistent narration)
Clarity: 85% (clear pronunciation)
```

### For Casual Chat

```
Speed: 1.0x (natural pace)
Stability: 50% (balanced)
Clarity: 75% (good clarity)
```

### For Quick Scanning

```
Speed: 2.0x - 3.0x (fast playback)
Stability: 60% (consistent delivery at speed)
Clarity: 90% (maximum clarity for comprehension)
```

### For Character Voices

```
Speed: 0.75x - 1.0x (theatrical pacing)
Stability: 20% (high expressiveness)
Clarity: 50% (allow variation)
```

## Benefits

- ✅ **Personalization** - Adjust the voice to your preferences
- ✅ **Accessibility** - Slower speeds for comprehension
- ✅ **Efficiency** - Faster speeds for quick consumption
- ✅ **Quality Control** - Fine-tune ElevenLabs voice output
- ✅ **Flexibility** - Different settings for different use cases
- ✅ **Universal** - Speed works on all voices; stability and clarity are available for ElevenLabs voices

## Persistence

All settings are:
- ✅ Saved to localStorage
- ✅ Persisted across app restarts
- ✅ Applied automatically to all future TTS playback
- ✅ Changeable at any time

## Future Enhancements

### Potential Additions

- **Pitch control** for browser TTS
- **Volume control** per voice
- **Per-voice presets** (save favorite settings for each voice)
- **Quick presets** (Audiobook, Podcast, Speed Reader, etc.)
- **Real-time adjustment** while audio is playing
- **A/B comparison** to test settings side by side

### Advanced Features

- **Voice EQ** for fine-tuning frequency response
- **Emotion control** for ElevenLabs (happy, sad, excited, etc.)
- **Speaking style** selection (narration, conversation, etc.)
- **Prosody controls** (emphasis, pauses, intonation)

## Troubleshooting

### Sliders Not Responsive

- Check that voice output (TTS) is enabled
- Verify that a voice is selected
- Try refreshing the settings panel

### ElevenLabs Controls Disabled

- Make sure an ElevenLabs voice is selected (its ID starts with "elevenlabs:")
- Browser voices do not enable these controls (by design)
- Check that the ElevenLabs API key is configured

### Settings Not Saving

- Check browser localStorage permissions
- Try clearing the cache and reloading
- Verify that the settings store is persisting

### Speed Not Applying

- Browser TTS: The rate should change immediately
- ElevenLabs: Speed adjustment may vary by voice
- Try values between 0.5x and 2.0x for best results

## Testing

### To Test Speed Control

1. Enable TTS
2. Adjust the speed slider
3. Click the speaker icon on a message
4. The voice should speak at the selected speed

### To Test ElevenLabs Controls

1. Select an ElevenLabs voice
2. Adjust the stability slider
3. Adjust the clarity slider
4. Click the speaker icon
5. Listen for the difference in voice quality

### To Test Persistence

1. Adjust all sliders
2. Close settings
3. Restart the app
4. Open settings
5. The values should be preserved

## Recommended Settings

**Default (Balanced)**:
- Speed: 1.0x
- Stability: 50%
- Clarity: 75%

**Professional**:
- Speed: 1.0x
- Stability: 80%
- Clarity: 85%

**Expressive**:
- Speed: 1.0x
- Stability: 30%
- Clarity: 60%

**Fast Listener**:
- Speed: 1.75x
- Stability: 65%
- Clarity: 90%

---

**Status**: ✅ Complete  
**Version**: v0.2.0-rc  
**Date**: October 5, 2025