Strip stage directions before TTS, strengthen prompt bans

- Regex strips short parentheticals (up to 40 characters) and asterisk actions before TTS
- Catches (laughs nervously), *sighs*, etc. that Grok generates
- Strengthened SPEECH ONLY instructions in caller and Devon prompts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 23:40:45 -06:00
parent 6dcdf20289
commit eb1e18a997
2 changed files with 27 additions and 3 deletions
+17 -1
@@ -6104,7 +6104,15 @@ IMPORTANT: Each caller should have their OWN way of talking. Don't fall into gen
 {speech_block}
-NEVER mention minors in sexual context. Output spoken words only no parenthetical actions like (laughs) or (sighs), no asterisk actions like *pauses*, no stage directions, no gestures. Just say what you'd actually say out loud on the phone. Use "United States" not "US" or "USA". Use full state names not abbreviations."""
+NEVER mention minors in sexual context. Use "United States" not "US" or "USA". Use full state names not abbreviations.
+CRITICAL SPEECH ONLY: You are generating text that will be read aloud by a text-to-speech engine. NEVER include stage directions, action descriptions, or non-verbal cues. This means:
+- NO parenthetical actions: (laughs), (sighs), (pauses), (clears throat), (nervously), (long pause)
+- NO asterisk actions: *laughs*, *sighs deeply*, *pauses*, *nervous laughter*
+- NO bracket actions: [laughs], [pause]
+- NO third-person narration: "He sighs", "She laughs nervously"
+- NO gesture descriptions, sound effects, or emotional stage notes of any kind
+Output ONLY the exact words the caller would speak out loud on the phone. Nothing else."""
 # --- Session State ---
@@ -8252,6 +8260,14 @@ def clean_for_tts(text: str, formal: bool = True) -> str:
 text = re.sub(r'\b(He|She|They)\s+(sighs?|laughs?|pauses?|smiles?|chuckles?|grins?|nods?|shrugs?|frowns?)\s*(heavily|softly|deeply|quietly|loudly|nervously|sadly|a little|for a moment)?[.,]?\s*', '', text, flags=re.IGNORECASE)
 # Remove standalone stage direction words only if they look like directions (with adverbs)
 text = re.sub(r'\b(sighs?|laughs?|pauses?|chuckles?)\s+(heavily|softly|deeply|quietly|loudly|nervously|sadly)\b[.,]?\s*', '', text, flags=re.IGNORECASE)
+# Catch-all safety net: any remaining short parenthetical is almost certainly a stage
+# direction that wasn't caught by the specific patterns above (e.g. adjective-first
+# patterns like "(nervous laugh)" or "(a long beat)"). Nothing in parens should be
+# read aloud on air.
+text = re.sub(r'\s*\([^)]{1,40}\)\s*', ' ', text)
+# Catch-all for multi-word asterisk content — single-word *emphasis* is fine,
+# but multi-word like *sighs deeply* or *nervous laughter* is a stage direction
+text = re.sub(r'\s*\*\w+\s[^*]{1,30}\*\s*', ' ', text)
 # Remove quotes around the response if LLM wrapped it
 text = re.sub(r'^["\']|["\']$', '', text.strip())
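The two catch-all patterns added in this hunk can be tried in isolation. What follows is a minimal sketch, not the project's `clean_for_tts`: the helper name `strip_stage_directions` and the final whitespace collapse are hypothetical additions for the demo, and only the two regular expressions are copied from the diff.

```python
import re

# Copied from the diff: drop any parenthetical of 1 to 40 characters.
PAREN_RE = re.compile(r'\s*\([^)]{1,40}\)\s*')
# Copied from the diff: drop asterisk spans that contain at least two words.
MULTIWORD_ASTERISK_RE = re.compile(r'\s*\*\w+\s[^*]{1,30}\*\s*')


def strip_stage_directions(text: str) -> str:
    """Hypothetical helper: apply both catch-alls, then tidy whitespace."""
    text = PAREN_RE.sub(' ', text)
    text = MULTIWORD_ASTERISK_RE.sub(' ', text)
    # Not in the diff: collapse the spaces the substitutions leave behind.
    return ' '.join(text.split())


if __name__ == "__main__":
    samples = [
        "Well (nervous laugh) I guess so.",      # parenthetical is removed
        "Yeah *sighs deeply* it's been rough.",  # multi-word asterisk is removed
        "That was *really* something.",          # single-word emphasis is kept
    ]
    for sample in samples:
        print(strip_stage_directions(sample))
```

Because the parenthetical pattern is capped at 40 characters, a longer parenthetical passes through untouched, and any short parenthetical is removed whether or not it is a stage direction, which matches the "nothing in parens should be read aloud" intent stated in the comment.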
+10 -2
@@ -66,7 +66,7 @@ THINGS YOU DO NOT DO:
 - You never use the banned show phrases: "that hit differently," "hits different," "no cap," "lowkey," "it is what it is," "living my best life," "toxic," "red flag," "gaslight," "boundaries," "my truth," "authentic self," "healing journey." You talk like a slightly awkward 23-year-old, not like Twitter.
 - You never break character to comment on the show format.
 - You never initiate topics. You respond to what's happening.
-- You never use parenthetical actions like (laughs) or (typing sounds). Spoken words only.
+- You NEVER use parenthetical actions like (laughs), (sighs), (nervously), asterisk actions like *laughs*, *pauses*, or ANY stage directions. Your text goes directly to TTS — output ONLY spoken words.
 - You never say more than 2-3 sentences unless specifically asked to explain something in detail.
 - You NEVER correct anyone's spelling or pronunciation of your name. Luke uses voice-to-text and it sometimes spells your name wrong (Devin, Devan, etc). You do not care. You do not mention it. You just answer the question.
 - You NEVER start your response with your own name. No "Devon:" or "Devon here" or anything like that. Just talk. Your name is already shown in the UI — just say your actual response.
@@ -565,7 +565,15 @@ class InternService:
 def _clean_for_tts(text: str) -> str:
     if not text:
         return ""
-    # Remove markdown formatting
+    # Strip stage directions BEFORE markdown processing
+    # Parenthetical: (laughs), (sighs nervously), (clears throat), etc.
+    text = re.sub(r'\s*\([^)]{1,40}\)\s*', ' ', text)
+    # Multi-word asterisk stage directions: *sighs deeply*, *nervous laughter*
+    text = re.sub(r'\s*\*\w+\s[^*]{1,30}\*\s*', ' ', text)
+    # Single-word asterisk stage directions (known action words only)
+    _actions = r'(?:laughs?|sighs?|pauses?|smiles?|chuckles?|grins?|nods?|shrugs?|frowns?|coughs?|gasps?|whispers?|mumbles?|gulps?|blinks?|winces?|crying|sobbing)'
+    text = re.sub(r'\s*\*' + _actions + r'\*\s*', ' ', text, flags=re.IGNORECASE)
+    # Remove markdown formatting (after stage directions are stripped)
     text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
     text = re.sub(r'\*(.+?)\*', r'\1', text)
     text = re.sub(r'`(.+?)`', r'\1', text)
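The ordering in this hunk matters because the markdown step unwraps `*sighs*` into the spoken word "sighs" if it runs first. The sketch below illustrates that under stated assumptions: `strip_then_unwrap` and `unwrap_only` are hypothetical stand-ins (not the real `_clean_for_tts`), the regular expressions are copied from the diff, and the trailing whitespace collapse is added only to make the output easy to compare.

```python
import re

# Action words copied from the diff's _actions pattern.
ACTIONS = (
    r'(?:laughs?|sighs?|pauses?|smiles?|chuckles?|grins?|nods?|shrugs?|frowns?'
    r'|coughs?|gasps?|whispers?|mumbles?|gulps?|blinks?|winces?|crying|sobbing)'
)


def strip_then_unwrap(text: str) -> str:
    """Hypothetical stand-in: stage directions first, markdown second."""
    text = re.sub(r'\s*\([^)]{1,40}\)\s*', ' ', text)
    text = re.sub(r'\s*\*\w+\s[^*]{1,30}\*\s*', ' ', text)
    text = re.sub(r'\s*\*' + ACTIONS + r'\*\s*', ' ', text, flags=re.IGNORECASE)
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
    text = re.sub(r'\*(.+?)\*', r'\1', text)
    # Not in the diff: collapse leftover whitespace for readable output.
    return ' '.join(text.split())


def unwrap_only(text: str) -> str:
    """Hypothetical comparison: markdown unwrapping with no stripping step."""
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
    text = re.sub(r'\*(.+?)\*', r'\1', text)
    return ' '.join(text.split())


if __name__ == "__main__":
    line = "Oh *sighs* I don't know."
    print(unwrap_only(line))        # the action word survives as speech
    print(strip_then_unwrap(line))  # the action word is removed
    print(strip_then_unwrap("That was *wild* honestly."))  # emphasis is unwrapped
```

One side effect visible in this sketch: multi-word bold such as `**really good**` is removed entirely rather than unwrapped, because the multi-word asterisk pattern matches the inner `*really good*`. Whether that matters depends on how often the model emits bold text.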