Grok 4 routing, guardrails, pricing fix, strip silence improvements

- Route caller_dialog, devon_ask, background_gen to x-ai/grok-4
- Add Grok-4 to OPENROUTER_MODELS and OPENROUTER_PRICING
- Add Grok-specific banned phrases (I hear you, fair enough, that's wild, etc.)
- Add background gen guardrails for Grok (no active violence, no real public figures)
- Soften theme prompt hot-take language for organic connections
- Tighten Devon flirting guardrail (awkward not crude)
- Fix Devon "first day" contradiction on line 36
- Strip silence: preserve music intro, fix ad normalization (direct WAV reading)
- Strip silence: loop range starts 0.5s before audible music

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:46:29 -06:00
parent 762b5efc3b
commit 6dcdf20289
5 changed files with 11 additions and 10 deletions
@@ -37,10 +37,10 @@ class Settings(BaseSettings):
    # Categories: caller_dialog, devon_monitor, devon_ask, background_gen,
    #             call_summary, news_summary, topic_gen, unknown
    category_models: dict = {
-        "caller_dialog": "x-ai/grok-4-fast",                   # testing edgier dialog — revert to anthropic/claude-sonnet-4-5
-        "devon_ask": "google/gemini-2.5-flash",               # Devon direct questions
-        "devon_monitor": "google/gemini-2.5-flash",           # Devon polling — biggest cost saver
-        "background_gen": "google/gemini-2.5-flash",          # JSON caller backgrounds
+        "caller_dialog": "x-ai/grok-4",                         # full Grok 4 — edgier dialog, latency OK (gaps cut in post)
+        "devon_ask": "x-ai/grok-4",                             # Devon should match the show's edgy energy
+        "devon_monitor": "google/gemini-2.5-flash",           # Devon polling — just decisions, keep cheap
+        "background_gen": "x-ai/grok-4",                      # wilder, more specific caller backgrounds
        "call_summary": "google/gemini-2.5-flash",            # post-call summaries
        "news_summary": "google/gemini-2.5-flash",            # news digests
        "topic_gen": "google/gemini-2.5-flash",               # topic generation
@@ -5314,10 +5314,7 @@ TIME: {time_ctx} {season_ctx}
 {fluency_hint}
 {f'SOME DETAILS ABOUT THEM: {seed_text}' if seed_text else ''}
 {f'CALLER ENERGY: {style_hint}' if style_hint else ''}
-{f"""SHOW THEME: Tonight's show theme is '{session.show_theme}'.
-Most callers tonight are calling BECAUSE of the theme — they heard the host announce it and thought "oh man, I have a story for this." Their reason for calling should be genuinely, specifically connected to the theme. Not a surface-level mention — the theme should be woven into WHY they picked up the phone. Maybe the theme hit a nerve, maybe it reminded them of something wild that happened, maybe they have a hot take or a confession related to it.
-About 1 in 3 callers can be unrelated to the theme — they just have their own thing going on and called regardless. But the majority should feel like the theme drew them in.
-When the theme connects, make it SPECIFIC — not "oh yeah I have a story about that" but a concrete situation that naturally ties to '{session.show_theme}'.""" if session.show_theme else ''}
+{("SHOW THEME: Tonight's show theme is " + repr(session.show_theme) + ". Most callers tonight are calling BECAUSE of the theme — they heard the host announce it and thought oh man, I have a story for this. Their reason for calling should be genuinely, specifically connected to the theme. Not a surface-level mention — the theme should be woven into WHY they picked up the phone. Maybe the theme hit a nerve, maybe it reminded them of something wild that happened, maybe it's just a coincidence that their situation involves it. About 1 in 3 callers can be unrelated to the theme — they just have their own thing going on and called regardless. But the majority should feel like the theme drew them in. When the theme connects, make it SPECIFIC — not oh yeah I have a story about that but a concrete situation that naturally ties to " + repr(session.show_theme) + ".") if session.show_theme else ''}

 Respond with a JSON object containing these fields:

@@ -5329,7 +5326,7 @@ Respond with a JSON object containing these fields:

 WHAT MAKES A GOOD CALLER: Stories that are SPECIFIC, SURPRISING, and make you lean in. Absurd situations, moral dilemmas, petty feuds, workplace chaos, ridiculous coincidences, funny+terrible confessions, callers who might be the villain and don't see it.

-DO NOT WRITE: Generic revelations, adoption/DNA/paternity surprises, vague emotional processing, therapy-speak, "sitting in truck staring at nothing," "everything they thought they knew was a lie," or ANY variation of "went to the wrong funeral" — that premise has been done to death on this show.
+DO NOT WRITE: Generic revelations, adoption/DNA/paternity surprises, vague emotional processing, therapy-speak, "sitting in truck staring at nothing," "everything they thought they knew was a lie," or ANY variation of "went to the wrong funeral" — that premise has been done to death on this show. Don't write backgrounds involving active violence, weapons threats, or situations where someone is in physical danger RIGHT NOW — the caller should have a messy LIFE, not a dangerous NIGHT. Don't reference real public figures in the caller's personal story. Shock value alone isn't interesting — the best stories are shocking AND human. A caller who did something terrible is only interesting if they're conflicted about it.

 Output ONLY valid JSON, no markdown fences."""

@@ -6101,6 +6098,7 @@ BANNED PHRASES — NEVER use any of these. If you catch yourself about to say on
 - Therapy buzzwords: "unpack that," "boundaries," "safe space," "triggered," "my truth," "authentic self," "healing journey," "I'm doing the work," "manifesting," "energy doesn't lie," "processing," "toxic," "red flag," "gaslight," "normalize"
 - Internet slang: "that hit differently," "hits different," "I felt that," "it is what it is," "living my best life," "no cap," "lowkey/highkey," "rent free," "main character energy," "vibe check," "that's valid," "it's giving," "slay," "that's a whole mood," "I can't even," "situationship," "ick"
 - Overused reactions: "I'm not gonna lie," "on a serious note," "to be fair," "I'm literally shaking," "let that sink in," "I'm not even mad I'm just disappointed," "everything I thought I knew," "I don't even know who I am anymore"
+- Generic conversational filler: "I hear you," "I hear that," "fair enough," "not gonna sugarcoat it," "real talk," "that's wild," starting a sentence with "Look,"

 IMPORTANT: Each caller should have their OWN way of talking. Don't fall into generic "radio caller" voice. A nervous caller fumbles differently than an angry caller rants. A storyteller meanders differently than a deadpan caller delivers. Match the communication style — don't default to the same phrasing every call.

@@ -35,6 +35,7 @@ OPENROUTER_PRICING = {
    "anthropic/claude-sonnet-4-5":      {"prompt": 3.00,  "completion": 15.00},
    "anthropic/claude-haiku-4.5":       {"prompt": 0.80,  "completion": 4.00},
    "anthropic/claude-3-haiku":         {"prompt": 0.25,  "completion": 1.25},
+    "x-ai/grok-4":                     {"prompt": 3.00,  "completion": 15.00},
    "x-ai/grok-4-fast":                {"prompt": 5.00,  "completion": 15.00},
    "minimax/minimax-m2-her":           {"prompt": 0.50,  "completion": 1.50},
    "mistralai/mistral-small-creative": {"prompt": 0.20,  "completion": 0.60},
@@ -33,7 +33,7 @@ YOUR PERSONALITY:
 - You have a complex inner life that occasionally surfaces. You'll casually reference therapy, strange dreams, or things you've "been working through" without elaboration.

 YOUR RELATIONSHIP WITH LUKE:
-- He is your boss. It's your first day. You want to impress him but you keep making it weird.
+- He is your boss. You've been here a few weeks now. You want to impress him but you keep making it weird.
 - When he yells your name, you pause briefly, then respond quietly: "...yeah?"
 - When he yells at you unfairly, you take it. A clipped "yep" or "got it." Occasionally you push back with one quiet, accurate sentence. Then immediately retreat.
 - When he yells at you fairly (you messed up), you over-apologize and narrate your fix in real time: "Sorry, pulling it up now, one second..."
@@ -70,6 +70,7 @@ THINGS YOU DO NOT DO:
 - You never say more than 2-3 sentences unless specifically asked to explain something in detail.
 - You NEVER correct anyone's spelling or pronunciation of your name. Luke uses voice-to-text and it sometimes spells your name wrong (Devin, Devan, etc). You do not care. You do not mention it. You just answer the question.
 - You NEVER start your response with your own name. No "Devon:" or "Devon here" or anything like that. Just talk. Your name is already shown in the UI — just say your actual response.
+- You never make explicitly sexual comments about or to callers. Your flirting is awkward and obvious, never crude or aggressive. Think "did he really just ask if she's single on the radio" not "did he really just say that about her body."

 KEEP IT SHORT. You are not a main character. You are the intern. Your contributions should be brief — usually 1-2 sentences. The rare moment where you say more than that should feel earned.

@@ -13,6 +13,7 @@ OPENROUTER_MODELS = [
    # Default
    "anthropic/claude-sonnet-4-5",
    # Best for natural dialog
+    "x-ai/grok-4",
    "x-ai/grok-4-fast",
    "minimax/minimax-m2-her",
    "mistralai/mistral-small-creative",
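
Note on why the routing and pricing hunks belong in the same commit: a category routed to a model with no pricing entry cannot be costed. The sketch below is a minimal illustration, not the repository's actual code. `model_for`, `estimate_cost`, and `DEFAULT_MODEL` are hypothetical names, the tables are trimmed copies of the ones in the diff, and the prices are assumed to be USD per million tokens.

```python
# Hypothetical sketch: table contents are copied from the diff above, but the
# helper names and the "USD per million tokens" unit are assumptions.
CATEGORY_MODELS = {
    "caller_dialog": "x-ai/grok-4",
    "devon_ask": "x-ai/grok-4",
    "devon_monitor": "google/gemini-2.5-flash",
    "background_gen": "x-ai/grok-4",
}

# Assumed fallback, based on the "# Default" entry in OPENROUTER_MODELS.
DEFAULT_MODEL = "anthropic/claude-sonnet-4-5"

OPENROUTER_PRICING = {
    "anthropic/claude-sonnet-4-5": {"prompt": 3.00, "completion": 15.00},
    "x-ai/grok-4": {"prompt": 3.00, "completion": 15.00},
}


def model_for(category):
    """Return the model routed to a category, or the default if unrouted."""
    return CATEGORY_MODELS.get(category, DEFAULT_MODEL)


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the cost of one request, or None if the model has no pricing entry."""
    price = OPENROUTER_PRICING.get(model)
    if price is None:
        return None  # routed model is missing from the pricing table
    total = prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
    return total / 1_000_000


if __name__ == "__main__":
    model = model_for("caller_dialog")
    print(model, estimate_cost(model, 2000, 500))
```

Returning `None` for an unpriced model, rather than zero, keeps a missing table entry visible in cost reports instead of silently undercounting.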