33 Commits

Author SHA1 Message Date
luke 376265eec7 Show quality fixes + preflight check
Ep47 post-mortem: fixed theme ignored by callers (backgrounds now
regenerate when theme is set), style-to-model race condition (fallback
to sonnet instead of pool[0]), removed bad pronunciation fixes, added
age-awareness to voice matching, raised MIN_RESPONSE_WORDS to 50.

Swapped problematic model mappings: conspiracy→qwen, know_it_all→mistral,
quiet_nervous→llama, emotional→kimi.

Added GET /api/show/preflight endpoint with 4 checks: model diversity,
theme penetration, voice-age alignment, response coherence (2-exchange
simulation of all callers). Frontend preflight modal with expandable
check cards.

Fixed active caller button not highlighting (moved highlight code before
potentially-failing caller info panel code).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 01:17:34 -06:00
luke f3c91fc385 Devon personality + Whisper name fix + music vocal filtering
- Devon: more conversational when addressed directly (500 tokens, 3-5 sentences)
- Devon: monitor prompt rewritten to encourage more contributions
- Devon: polling interval 15s (was 30s), removed 2-message minimum
- Whisper: no fuzzy name matching for 3-char names, require first letter match
- fetch_music.py: post-fetch vocal detection filter using musicinfo tags
- scan_music_vocals.py: new script to scan existing library for vocal tracks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:59:03 -06:00
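The Whisper name rules in this commit (fuzzy correction skipped for 3-character names and gated on a first-letter match) could be sketched roughly as follows. The function names, the `max_distance` default, and the reading of "3-char names" as "three characters or fewer" are assumptions for illustration, not the repository's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def correct_name(word: str, caller_names: list[str], max_distance: int = 1) -> str:
    """Return the caller name a transcribed word most plausibly refers to.

    Hypothetical helper: exact matches win; fuzzy matching is skipped for
    short names and requires the first letter to agree.
    """
    lower = word.lower()
    for name in caller_names:
        if lower == name.lower():
            return name
    for name in caller_names:
        if len(name) <= 3:                     # too short to fuzzy-match safely
            continue
        if lower[:1] != name[:1].lower():      # first letter must match
            continue
        if levenshtein(lower, name.lower()) <= max_distance:
            return name
    return word
```

With these guards, "Devan" would be corrected to "Devon", while "Tim" is never rewritten to "Jim" and "Kevon" is left alone because its first letter differs.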
luke c69c2ad532 Fix tonight's show issues: Whisper bias, boring callers, Devon, short responses
- Remove caller names from Whisper hint (was corrupting transcriptions)
- Background gen switched to Claude Sonnet 4.6 (cheap models = thin backgrounds)
- "WHAT MAKES A GOOD CALLER" rewritten with concrete examples
- Grok guardrails loosened (were cutting too much edge)
- Response length guidance added to caller prompt
- Retry under-20-word responses once for more detail
- Devon monitor softened from "default silence" to balanced
- Ban stalling phrases: "where was I", "as I was saying", etc.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 05:21:23 -06:00
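The "retry under-20-word responses once" rule might be sketched like this. The `generate` callable, the nudge text, and the choice to keep whichever reply is longer are all assumptions; the commit only states that short responses are retried once.

```python
from typing import Callable

MIN_WORDS = 20  # threshold named in the commit message


def generate_with_retry(
    generate: Callable[[str], str],
    prompt: str,
    min_words: int = MIN_WORDS,
) -> str:
    """Call `generate`; if the reply is too short, retry exactly once."""
    reply = generate(prompt)
    if len(reply.split()) >= min_words:
        return reply
    # Hypothetical nudge appended to the prompt for the single retry.
    retry = generate(prompt + "\n\nGive a fuller answer with more detail.")
    # Assumed tie-break: keep the longer of the two replies.
    return retry if len(retry.split()) > len(reply.split()) else reply
```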
luke 8dbbd92d3a Fix returning caller eligibility — 1+ calls, not 2+
The 2+ requirement created a catch-22: regulars couldn't return because they
needed 2 calls, but couldn't get a second call without returning. Dynamic
count already prevents flooding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:42:53 -06:00
luke fa36f8d184 Dynamic returning caller count — need 3+ eligible for variety
Only inject 2 returners if pool has 3+ eligible (so it's not the same every show).
With 2 eligible, inject 1. With 1 or 0, inject none.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:38:47 -06:00
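The injection rule described in this commit reduces to a small function. This is a minimal sketch; the function name is hypothetical and only the thresholds come from the commit message.

```python
def returning_caller_count(eligible: int) -> int:
    """How many returning callers to inject for a given eligible-pool size."""
    if eligible >= 3:
        return 2   # enough variety that shows differ from each other
    if eligible == 2:
        return 1
    return 0       # with 1 or 0 eligible, inject none
```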
luke 794ad98cf0 Replace music dropdown with genre quick-select buttons
- One-click genre buttons play random track from that genre
- Active genre highlighted, now-playing bar shows track name
- Only genres with tracks shown, crossfade on genre switch
- M key replays active genre or picks random

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:34:44 -06:00
luke f5eabd7dc4 Add fetch_music.py (Jamendo API) + expand genre keywords
- Downloads instrumental tracks from Jamendo by genre (jazz, lofi, blues, ambient, etc.)
- Filters: no vocals, 60-300s, sorted by popularity
- Saves to music/ with genre tags, tracks attribution
- Add genre keywords: ambient, chill, acoustic, classical, country, electronic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:18:26 -06:00
luke f717edeacb Fix style map key mismatch — API uses 'map', frontend was using 'style_map'
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:45:21 -06:00
luke 56607879ee Fix style-matched dropdowns — populate from full model list, not just pool
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:42:53 -06:00
luke fcefabdaee Expand style-matched routing to 10 models for maximum caller variety
- Grok 4.1 Fast: high_energy, bragger, comedian, small_town_gossip
- Grok 4 Full: confrontational (needs deep reasoning for arguments)
- Claude Sonnet 4.6: quiet_nervous, emotional (genuine vulnerability)
- Kimi K2: sweet_earnest (warm, creative, different texture than Claude)
- Mistral Large: deadpan, mysterious (dry, precise)
- DeepSeek Chat: angry_venting (raw, unfiltered rage)
- DeepSeek R1 Distill: oversharer, conspiracy (commits fully, no hedging)
- Qwen: storyteller, rambling (loves tangents and detail)
- Gemini 2.5 Pro: know_it_all (pedantic, cites sources)
- Llama 3.3 70B: world_weary, reluctant, first_time (casual, natural)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:40:09 -06:00
luke 58495d2c75 Fix stale model detection — validate against current OPENROUTER_MODELS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:37:21 -06:00
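The two stale-map checks from this commit and the one before it (every style mapped to the same model; a mapped model missing from the current model list) could be combined as in the sketch below. The function name, argument shapes, and the treatment of an empty map as stale are assumptions.

```python
def is_stale_model_map(style_map: dict[str, str], known_models: set[str]) -> bool:
    """Heuristic: does a persisted style-to-model map look out of date?"""
    models = set(style_map.values())
    if not models:
        return True                       # assumed: an empty map counts as stale
    if len(style_map) > 1 and len(models) == 1:
        return True                       # every style points at one model
    return not models <= known_models     # some model is no longer offered
```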
luke 51961dc19b Fix stale model map detection — check if all values are same model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:33:47 -06:00
luke c516402402 Update model routing with latest OpenRouter models
Style-matched defaults:
- Grok 4.1 Fast for edgy callers (high_energy, confrontational, comedian etc.)
- Claude Sonnet 4.6 for emotional callers (quiet_nervous, sweet_earnest, emotional)
- Mistral Large 2512 for deadpan/mysterious/world-weary
- DeepSeek R1 Distill for storyteller/oversharer/conspiracy/rambler
- Gemini 2.5 Flash for know_it_all
- Llama 3.3 70B for first_time/reluctant callers

Category routing: Grok 4.1 Fast for dialog/devon/backgrounds, Gemini Flash for monitor/summary
Updated OPENROUTER_MODELS and OPENROUTER_PRICING with all new models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:31:33 -06:00
luke e614599650 Fix checkpoint restoring stale caller model defaults
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:20:32 -06:00
luke d36de95577 Default caller model strategy to style_matched
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:18:36 -06:00
luke 0147be4e0c Normalization diagnostics + SFX track support
- Detailed logging for normalize_track_items (item count, RMS, gain, applied/skipped)
- Add SFX track normalization (track 5)
- Will reveal why ad/ident normalization silently fails

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:14:34 -06:00
luke 390f138601 Devon improvements: independent audio, realism overhaul
- Devon audio independent of caller hangup (separate stop events)
- Personal anecdotes capped at ~30% of responses (was every time)
- Interjection criteria tightened ("default is silence")
- Devon sees his own recent history to avoid repeating info
- Response variety: permits minimal reactions, confusion, silence
- Monitor prompt rewritten to be gatekeeping, not encouraging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:08:22 -06:00
luke 9eaf2fe5e3 Fix avatar misgendering, returning caller overflow, false callbacks
- Avatar prefetch checks gender marker, re-fetches on mismatch
- Returning callers need 2+ actual calls before re-eligible (was 1)
- Promotion rate lowered 10% → 5% to prevent pool flooding
- Callback injection skipped for returning callers (already have context)
- Show history clarifies "you are NOT that caller" to prevent identity confusion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:03:07 -06:00
luke 314d5f9452 Caller model routing — cycle, style-matched, mid-show override
- Three strategies: single model, cycle through pool, style-matched
- 18 communication styles mapped to 7 models (Grok, Sonnet, Mistral, Qwen, DeepSeek, Gemini, Llama)
- Per-caller model locked for entire call, overridable mid-show
- Model badges on caller buttons and info panel
- Settings UI for strategy, pool, style mapping, fallback
- Fallback to Sonnet on model failure
- 6 new models added to pricing and dropdown
- Checkpoint persistence for all model state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:58:03 -06:00
luke e0fb3cac68 Make make_clips.py resilient — timeouts, retries, skip-on-failure
- 60s timeout + retry on all LLM calls
- 120-300s timeout on all subprocess/ffmpeg calls
- Per-clip error isolation (one failure doesn't kill the run)
- Progress indicators for each clip being processed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 17:36:41 -06:00
luke 4589670b37 Fix Whisper misspelling caller names — hint + fuzzy correction
- Pass all caller names as Whisper initial_prompt hint for correct spelling
- Post-transcription fuzzy match corrects remaining misspellings (Levenshtein)
- Prevents AI callers from "correcting" the host on their own name

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 07:42:18 -06:00
luke eb1e18a997 Strip stage directions before TTS, strengthen prompt bans
- Regex strips all parentheticals and asterisk actions before TTS
- Catches (laughs nervously), *sighs*, etc. that Grok generates
- Strengthened SPEECH ONLY instructions in caller and Devon prompts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 23:40:45 -06:00
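A regex pass of the kind this commit describes might look like the following. The pattern and helper name are illustrative assumptions, not the project's actual expression; nested parentheses are not handled.

```python
import re

# Matches "(laughs nervously)" style parentheticals and "*sighs*" style actions.
_STAGE_DIRECTION = re.compile(r"\([^)]*\)|\*[^*]*\*")


def strip_stage_directions(text: str) -> str:
    """Remove stage directions so only spoken words reach TTS."""
    cleaned = _STAGE_DIRECTION.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)         # collapse leftover gaps
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)  # no space before punctuation
    return cleaned.strip()
```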
luke 6dcdf20289 Grok 4 routing, guardrails, pricing fix, strip silence improvements
- Route caller_dialog, devon_ask, background_gen to x-ai/grok-4
- Add Grok-4 to OPENROUTER_MODELS and OPENROUTER_PRICING
- Add Grok-specific banned phrases (I hear you, fair enough, that's wild, etc.)
- Add background gen guardrails for Grok (no active violence, no real public figures)
- Soften theme prompt hot-take language for organic connections
- Tighten Devon flirting guardrail (awkward not crude)
- Fix Devon "first day" contradiction on line 36
- Strip silence: preserve music intro, fix ad normalization (direct WAV reading)
- Strip silence: loop range starts 0.5s before audible music

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:46:29 -06:00
luke 762b5efc3b Strip silence: preserve music intro, fix ad normalization, smart loop range
- Preserve first silence in first DIALOG region (music intro before host speaks)
- Fix ad/ident normalization using direct WAV reading (accessor failed after splits)
- Loop range starts 0.5s before audible music, ends at last item
- Disable broken music lead-in nudge (intro preservation handles it)
- Caller dialog model set to Grok for testing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 02:32:34 -06:00
luke 3dd6a83c68 Full app audit: 24 fixes across backend, frontend, infra, content, social
Critical fixes:
- Fix hangup-during-respond crash (null caller guard)
- Fix double-click caller race condition
- Stem recorder: non-daemon thread, disk error handling, 30s flush timeout
- Frontend startCall() error handling

High priority:
- Devon: filter tool errors from speech, shorter monitor prompt, 30s interval
- TTS ghost message fix (add to history after TTS, not before)
- Expand banned phrase list (12 new phrases)
- Increase returning callers from 1 to 2 per session
- Platform-tailored social posts with staggered scheduling
- YouTube dynamic tags from episode content
- Social post retry logic (2 attempts, 5s delay)
- Frontend: error handling on all raw fetch calls

Medium:
- stem_recorder null check race (local var capture in audio.py)
- Reactive shape directive expanded
- REACT TO LUKE moved higher in caller prompt
- Devon tenure updated ("few weeks" not "first day")
- D shortcut Escape to unfocus
- Volume slider debounced (150ms)
- Settings modal widened to 550px
- Backup script (daily MariaDB dump + data/ rsync to NAS)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:57:50 -06:00
luke 5e98ed0e11 Fix LinkedIn posting to use correct account, blocklist personal profile
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:02:47 -06:00
luke fcf13bae22 Fix repetitive episode titles — require specific caller/situation references
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 04:06:12 -06:00
luke c30a75cc8f Fix X/Twitter posting — add who_can_reply_post and __type params
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 04:02:55 -06:00
luke 90e51698b8 Devon fixes, theme prompt rewrite, sentence trimmer, cost tracker, normalization
- Fix Devon "if that makes sense" overuse (limit to once per show)
- Suppress Devon failed lookup notifications for self-initiated searches
- Strengthen show theme prompts (2/3 callers call because of theme)
- Fix sentence trimmer splitting on abbreviations (Mr. Mrs. Dr. etc.)
- Fix cost tracker data lost on server restart (persist in checkpoint)
- Ad/ident normalization targets -4dB below dialog for perceived loudness match
- Lower cross-speaker transition threshold to 5s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 03:55:55 -06:00
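The ad/ident normalization target can be expressed as a gain calculation. The sketch below assumes "-4dB below dialog" means a target level 4 dB under the measured dialog RMS; the function names and the pure-Python RMS are illustrative only.

```python
import math


def rms_dbfs(samples: list[float]) -> float:
    """RMS level of float samples (full scale = 1.0) in dBFS."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def ad_gain(dialog_dbfs: float, ad_dbfs: float, offset_db: float = -4.0) -> float:
    """Linear gain that moves an ad/ident to `offset_db` relative to dialog."""
    gain_db = (dialog_dbfs + offset_db) - ad_dbfs
    return 10 ** (gain_db / 20)
```

For example, dialog at -20 dBFS and an ad at -16 dBFS gives a target of -24 dBFS, so the ad needs -8 dB of gain, a linear factor of about 0.40.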
luke 5d8ab57e20 Show theme feature, Irish music genre, strip silence overhaul
- Add show theme UI in header bar + backend API (inject into caller prompts)
- Add Irish genre category for music dropdown
- Strip silence: RMS-based speaker detection (fixes Devon not being identified)
- Strip silence: Devon-specific 3s threshold for interjections
- Strip silence: sparse track item handling in shift logic
- Strip silence: music lead-in preservation after silence removal
- Strip silence: no max gap limit (IDENT/AD regions protect breaks)
- Add analyze_gaps.py tool for per-show threshold analysis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 03:30:15 -06:00
luke d33a022676 Add show theme feature for themed episodes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 23:46:48 -06:00
luke 7e2ef1fa2b Add MIT license, add X to social posting platforms
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 01:39:25 -06:00
luke 164cad456c Devon own stem/track/channel, per-category LLM routing, settings UI cleanup
Audio:
- Devon gets own stem, Reaper track (Input 17), and configurable channel
- play_caller_audio accepts stem_name + channel_override params
- Reaper script checks 4 voice tracks (Host, Devon, Live Caller, AI Caller)
- postprod.py includes devon stem in gap detection

Cost optimization:
- Per-category model routing: Sonnet for caller dialog, Gemini Flash for everything else
- Estimated 65% cost reduction ($4.32 → ~$1.50/show)
- Category models configurable from settings UI

Frontend:
- Settings panel: clean routing grid for output channels, model routing grid for LLM categories
- Devon channel added to audio routing
- Share icon SVG fill fix (currentColor)
- Website homepage iterations

Publishing:
- Revert Castopod API workaround (API re-enabled)
- Fix container media path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 17:05:19 -06:00
27 changed files with 5711 additions and 2164 deletions
+22
@@ -0,0 +1,22 @@
MIT License

Copyright (c) 2026 Luke MacNeil / MacNeil Media Group, LLC
https://macneilmediagroup.com

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+260
@@ -0,0 +1,260 @@
#!/usr/bin/env python3
"""Analyze silence gaps in podcast stems to find optimal strip-silence thresholds.

Usage: python analyze_gaps.py recordings/2026-03-17_235137/
"""

import sys
import numpy as np
import soundfile as sf
from pathlib import Path

BLOCK_SEC = 0.1
SILENCE_DB = -30
THRESHOLD = 10 ** (SILENCE_DB / 20)
MIN_VOICE_SEC = 0.3


def load_stem(path: Path) -> tuple[np.ndarray, int]:
    audio, sr = sf.read(path, dtype="float32")
    if audio.ndim > 1:
        audio = audio[:, 0]
    return audio, sr


def compute_rms_blocks(audio: np.ndarray, sr: int) -> np.ndarray:
    block_samples = int(sr * BLOCK_SEC)
    n_blocks = len(audio) // block_samples
    if n_blocks == 0:
        return np.array([0.0])
    trimmed = audio[:n_blocks * block_samples].reshape(n_blocks, block_samples)
    return np.sqrt(np.mean(trimmed ** 2, axis=1))


def compute_peak_blocks(audio: np.ndarray, sr: int) -> np.ndarray:
    block_samples = int(sr * BLOCK_SEC)
    n_blocks = len(audio) // block_samples
    if n_blocks == 0:
        return np.array([0.0])
    trimmed = audio[:n_blocks * block_samples].reshape(n_blocks, block_samples)
    return np.max(np.abs(trimmed), axis=1)


def analyze(stems_dir: Path):
    stems_dir = Path(stems_dir)

    voice_stems = {}
    for name in ["host", "devon", "caller"]:
        path = stems_dir / f"{name}.wav"
        if path.exists():
            print(f"Loading {name}...", end=" ", flush=True)
            audio, sr = load_stem(path)
            voice_stems[name] = audio
            print(f"{len(audio)/sr:.0f}s @ {sr}Hz")

    if not voice_stems:
        print("No voice stems found")
        return

    sr_val = sr
    duration = max(len(a) for a in voice_stems.values()) / sr_val
    print(f"\nTotal duration: {duration/60:.1f} min")

    # Compute per-track RMS and peak blocks
    track_rms = {}
    track_peak = {}
    for name, audio in voice_stems.items():
        track_rms[name] = compute_rms_blocks(audio, sr_val)
        track_peak[name] = compute_peak_blocks(audio, sr_val)

    n_blocks = min(len(v) for v in track_peak.values())

    # Detect gaps using same logic as Lua script (RMS for speaker ID, peak for silence)
    min_voice_blocks = int(MIN_VOICE_SEC / BLOCK_SEC)
    track_names = list(voice_stems.keys())

    gaps = []
    in_silence = False
    silence_start = 0
    track_before = None
    last_active = None
    voice_run = 0
    voice_run_track = None

    for i in range(n_blocks):
        # Peak for silence detection
        best_peak = max(track_peak[name][i] for name in track_names)

        # RMS for speaker identification
        best_rms = 0
        best_track = None
        for name in track_names:
            r = track_rms[name][i]
            if r > best_rms:
                best_rms = r
                best_track = name

        all_silent = best_peak < THRESHOLD
        if not all_silent:
            last_active = best_track

        if in_silence:
            if all_silent:
                voice_run = 0
                voice_run_track = None
            else:
                if voice_run == 0:
                    voice_run_track = best_track
                voice_run += 1
                if voice_run >= min_voice_blocks:
                    voice_start_block = i - (voice_run - 1)
                    gap_start = silence_start * BLOCK_SEC
                    gap_end = voice_start_block * BLOCK_SEC
                    dur = gap_end - gap_start
                    if dur >= 0.5:  # log gaps >= 0.5s
                        gaps.append({
                            "start": gap_start,
                            "end": gap_end,
                            "dur": dur,
                            "before": track_before or "?",
                            "after": voice_run_track or "?",
                        })
                    in_silence = False
                    voice_run = 0
                    voice_run_track = None
        else:
            if all_silent:
                in_silence = True
                silence_start = i
                track_before = last_active
                voice_run = 0
                voice_run_track = None

    # Trailing silence
    if in_silence:
        dur = (n_blocks - silence_start) * BLOCK_SEC
        if dur >= 0.5:
            gaps.append({
                "start": silence_start * BLOCK_SEC,
                "end": n_blocks * BLOCK_SEC,
                "dur": dur,
                "before": track_before or "?",
                "after": "end",
            })

    if not gaps:
        print("No gaps detected")
        return

    # Categorize gaps
    categories = {
        "host_self": [],        # Host -> Host
        "host_to_caller": [],   # Host -> Caller (TTS latency)
        "caller_to_host": [],   # Caller -> Host
        "host_to_devon": [],    # Host -> Devon (TTS latency)
        "devon_to_host": [],    # Devon -> Host
        "caller_to_devon": [],  # Caller -> Devon (interjection)
        "devon_to_caller": [],  # Devon -> Caller
        "other": [],
    }

    for g in gaps:
        b, a = g["before"], g["after"]
        if b == "host" and a == "host":
            categories["host_self"].append(g)
        elif b == "host" and a == "caller":
            categories["host_to_caller"].append(g)
        elif b == "caller" and a == "host":
            categories["caller_to_host"].append(g)
        elif b == "host" and a == "devon":
            categories["host_to_devon"].append(g)
        elif b == "devon" and a == "host":
            categories["devon_to_host"].append(g)
        elif b == "caller" and a == "devon":
            categories["caller_to_devon"].append(g)
        elif b == "devon" and a == "caller":
            categories["devon_to_caller"].append(g)
        else:
            categories["other"].append(g)

    # Print results
    print(f"\n{'='*70}")
    print(f"GAP ANALYSIS — {len(gaps)} gaps detected")
    print(f"{'='*70}")

    total_silence = sum(g["dur"] for g in gaps)
    print(f"Total silence: {total_silence:.0f}s ({total_silence/60:.1f} min)")
    print(f"Content after removal: ~{(duration - total_silence)/60:.1f} min")

    for cat_name, cat_gaps in sorted(categories.items(), key=lambda x: -len(x[1])):
        if not cat_gaps:
            continue
        durs = sorted([g["dur"] for g in cat_gaps])
        print(f"\n--- {cat_name} ({len(cat_gaps)} gaps) ---")
        print(f" Range: {durs[0]:.1f}s - {durs[-1]:.1f}s")
        print(f" Median: {np.median(durs):.1f}s Mean: {np.mean(durs):.1f}s")
        if len(durs) >= 5:
            print(f" P25: {np.percentile(durs, 25):.1f}s P75: {np.percentile(durs, 75):.1f}s")

        # Histogram
        brackets = [(0, 1), (1, 2), (2, 3), (3, 5), (5, 8), (8, 12), (12, 18), (18, 30), (30, 60), (60, 999)]
        print(f" Distribution:")
        for lo, hi in brackets:
            count = sum(1 for d in durs if lo <= d < hi)
            if count > 0:
                bar = "#" * count
                label = f"{lo}-{hi}s" if hi < 999 else f"{lo}s+"
                print(f" {label:>8s}: {bar} ({count})")

    # Find natural clusters and suggest thresholds
    print(f"\n{'='*70}")
    print("SUGGESTED THRESHOLDS")
    print(f"{'='*70}")

    # For each Devon-involved category, find the gap between interjection and TTS gaps
    devon_gaps = categories["host_to_devon"] + categories["devon_to_host"] + categories["caller_to_devon"] + categories["devon_to_caller"]
    if devon_gaps:
        devon_durs = sorted([g["dur"] for g in devon_gaps])
        # Look for a natural break between short (interjection) and long (TTS) gaps
        short = [d for d in devon_durs if d < 5]
        long = [d for d in devon_durs if d >= 5]
        if short and long:
            suggested = (max(short) + min(long)) / 2
            print(f"Devon threshold: {suggested:.1f}s (short gaps: {len(short)} up to {max(short):.1f}s, long gaps: {len(long)} from {min(long):.1f}s)")
        elif short:
            print(f"Devon threshold: {max(short) + 1:.1f}s (all gaps are short, max {max(short):.1f}s)")
        else:
            print(f"Devon threshold: 3.0s (all gaps are long, min {min(long):.1f}s)")

    caller_gaps = categories["host_to_caller"] + categories["caller_to_host"]
    if caller_gaps:
        caller_durs = sorted([g["dur"] for g in caller_gaps])
        short = [d for d in caller_durs if d < 5]
        long = [d for d in caller_durs if d >= 5]
        if short and long:
            suggested = (max(short) + min(long)) / 2
            print(f"Caller transition threshold: {suggested:.1f}s (short: {len(short)} up to {max(short):.1f}s, long: {len(long)} from {min(long):.1f}s)")
        elif long:
            print(f"Caller transition threshold: {min(long) - 1:.1f}s (all gaps >= {min(long):.1f}s)")

    host_self = categories["host_self"]
    if host_self:
        host_durs = sorted([g["dur"] for g in host_self])
        short = [d for d in host_durs if d < 5]
        long = [d for d in host_durs if d >= 5]
        if short and long:
            suggested = (max(short) + min(long)) / 2
            print(f"Same-speaker threshold: {suggested:.1f}s (short: {len(short)} up to {max(short):.1f}s, long: {len(long)} from {min(long):.1f}s)")
        elif long:
            print(f"Same-speaker threshold: {min(long) - 1:.1f}s (all gaps >= {min(long):.1f}s)")

    all_durs = sorted([g["dur"] for g in gaps])
    would_cut = [d for d in all_durs if d >= 3.0]
    print(f"\nWith current thresholds (Devon=3s, others=6s):")
    print(f" Would cut: ~{len(would_cut)} gaps, ~{sum(would_cut):.0f}s ({sum(would_cut)/60:.1f} min)")
    print(f" Result: ~{(duration - sum(would_cut))/60:.1f} min")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python analyze_gaps.py <stems_dir>")
        sys.exit(1)
    analyze(Path(sys.argv[1]))
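Two of the script's constants are worth working through numerically. The sketch below reuses the constant values from `analyze_gaps.py`; the variable names `truncated` and `rounded` are introduced here for illustration. It shows the dB-to-linear conversion behind `THRESHOLD`, and that `int(MIN_VOICE_SEC / BLOCK_SEC)` truncates to 2 rather than 3 because of binary floating point. Whether 2 or 3 blocks is the intended value is not stated in the source; `round()` is shown only as one option if 3 is wanted.

```python
BLOCK_SEC = 0.1
SILENCE_DB = -30
MIN_VOICE_SEC = 0.3

# Amplitude ratio for a level in dB: 10 ** (dB / 20).
threshold = 10 ** (SILENCE_DB / 20)
print(f"{SILENCE_DB} dB -> linear amplitude {threshold:.4f}")

# 0.3 / 0.1 is slightly below 3.0 in IEEE-754 doubles, so int() truncates to 2.
truncated = int(MIN_VOICE_SEC / BLOCK_SEC)
rounded = round(MIN_VOICE_SEC / BLOCK_SEC)
print(f"int() gives {truncated} blocks, round() gives {rounded} blocks")
```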
+13 -1
@@ -29,10 +29,22 @@ class Settings(BaseSettings):
     # LLM Settings
     llm_provider: str = "openrouter"  # "openrouter" or "ollama"
-    openrouter_model: str = "anthropic/claude-sonnet-4-5"
+    openrouter_model: str = "anthropic/claude-sonnet-4.6"  # primary/default model
     ollama_model: str = "llama3.2"
     ollama_host: str = "http://localhost:11434"
 
+    # Per-category model routing
+    # caller_dialog is overridden by style_matched routing (see Session.caller_model_map)
+    category_models: dict = {
+        "caller_dialog": "x-ai/grok-4.1-fast",  # fallback if style_matched disabled ($0.20/$0.50)
+        "devon_ask": "x-ai/grok-4.1-fast",  # Devon matches show energy, cheap ($0.20/$0.50)
+        "devon_monitor": "google/gemini-2.5-flash",  # just yes/no decisions, keep cheap ($0.15/$0.60)
+        "background_gen": "anthropic/claude-sonnet-4.6",  # backgrounds drive the whole call — worth the quality ($3/$15, ~$0.30/show)
+        "call_summary": "google/gemini-2.5-flash",  # post-call, no personality needed ($0.15/$0.60)
+        "news_summary": "google/gemini-2.5-flash",  # just digesting headlines ($0.15/$0.60)
+        "topic_gen": "google/gemini-2.5-flash",  # structured output ($0.15/$0.60)
+    }
+
     # TTS Settings
     tts_provider: str = "inworld"  # "kokoro", "elevenlabs", "inworld", "vits", or "bark"
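A lookup over a table like `category_models` might resolve a model as sketched below. Falling back to the primary `openrouter_model` for unknown categories is an assumption, and `model_for` is a hypothetical helper name; the model IDs are copied from the diff.

```python
DEFAULT_MODEL = "anthropic/claude-sonnet-4.6"  # mirrors openrouter_model

CATEGORY_MODELS = {
    "caller_dialog": "x-ai/grok-4.1-fast",
    "devon_monitor": "google/gemini-2.5-flash",
    "background_gen": "anthropic/claude-sonnet-4.6",
}


def model_for(category: str) -> str:
    """Pick the model for an LLM call category, defaulting to the primary model."""
    return CATEGORY_MODELS.get(category, DEFAULT_MODEL)
```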
+1655 -154
File diff suppressed because it is too large
+60 -30
@@ -64,6 +64,7 @@ class AudioService:
         self.output_device: Optional[int] = 12  # Radio Voice Mic (loopback output)
         self.caller_channel: int = 3  # Channel for caller TTS
+        self.devon_channel: int = 17  # Channel for Devon (intern)
         self.live_caller_channel: int = 9  # Channel for live caller audio
         self.music_channel: int = 5  # Channel for music
         self.sfx_channel: int = 3  # Channel for SFX
@@ -113,6 +114,7 @@ class AudioService:
         # Caller playback state
         self._caller_stop_event = threading.Event()
+        self._devon_stop_event = threading.Event()
         self._caller_thread: Optional[threading.Thread] = None
         # Host mic streaming state
@@ -164,6 +166,7 @@ class AudioService:
         self.input_channel = data.get("input_channel", 1)
         self.output_device = self._resolve_device(data, "output_device")
         self.caller_channel = data.get("caller_channel", 1)
+        self.devon_channel = data.get("devon_channel", 17)
         self.live_caller_channel = data.get("live_caller_channel", 4)
         self.music_channel = data.get("music_channel", 2)
         self.sfx_channel = data.get("sfx_channel", 3)
@@ -186,6 +189,7 @@ class AudioService:
             "output_device": self.output_device,
             "output_device_name": self._get_device_name(self.output_device),
             "caller_channel": self.caller_channel,
+            "devon_channel": self.devon_channel,
             "live_caller_channel": self.live_caller_channel,
             "music_channel": self.music_channel,
             "sfx_channel": self.sfx_channel,
@@ -222,6 +226,7 @@ class AudioService:
         input_channel: Optional[int] = None,
         output_device: Optional[int] = None,
         caller_channel: Optional[int] = None,
+        devon_channel: Optional[int] = None,
         live_caller_channel: Optional[int] = None,
         music_channel: Optional[int] = None,
         sfx_channel: Optional[int] = None,
@@ -240,6 +245,8 @@ class AudioService:
             self.output_device = output_device
         if caller_channel is not None:
             self.caller_channel = caller_channel
+        if devon_channel is not None:
+            self.devon_channel = devon_channel
         if live_caller_channel is not None:
             self.live_caller_channel = live_caller_channel
         if music_channel is not None:
@@ -267,6 +274,7 @@ class AudioService:
             "input_channel": self.input_channel,
             "output_device": self.output_device,
             "caller_channel": self.caller_channel,
+            "devon_channel": self.devon_channel,
             "live_caller_channel": self.live_caller_channel,
             "music_channel": self.music_channel,
             "sfx_channel": self.sfx_channel,
@@ -373,8 +381,9 @@ class AudioService:
             stream_ready.set()
             if self._recording:
                 self._recorded_audio.append(indata[:, record_channel].copy())
-            if self.stem_recorder:
-                self.stem_recorder.write("host", indata[:, record_channel].copy(), device_sr)
+            rec = self.stem_recorder
+            if rec:
+                rec.write("host", indata[:, record_channel].copy(), device_sr)
         print(f"Recording: opening stream on device {self.input_device} ch {self.input_channel} @ {device_sr}Hz ({max_channels} ch)")
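The `rec = self.stem_recorder` change reads the attribute once, so a concurrent `self.stem_recorder = None` from another thread cannot land between the truthiness check and the `.write()` call. A minimal sketch of the pattern, using hypothetical `FakeRecorder` and `AudioServiceSketch` classes rather than the real `AudioService`:

```python
class FakeRecorder:
    """Stand-in for the stem recorder; just remembers what was written."""

    def __init__(self):
        self.writes = []

    def write(self, stem, chunk):
        self.writes.append((stem, chunk))


class AudioServiceSketch:
    def __init__(self):
        self.stem_recorder = None  # may be set or cleared by another thread

    def on_audio(self, chunk):
        # Capture the attribute in a local: the check and the call below
        # are then guaranteed to see the same object (or the same None).
        rec = self.stem_recorder
        if rec:
            rec.write("host", chunk)
```

The unguarded form, `if self.stem_recorder: self.stem_recorder.write(...)`, performs two separate attribute reads, and the second can observe `None` if recording stops in between.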
@@ -419,13 +428,20 @@ class AudioService:
         return audio
 
-    def play_caller_audio(self, audio_bytes: bytes, sample_rate: int = 24000):
-        """Play caller TTS audio to specific channel of output device (interruptible)"""
+    def play_caller_audio(self, audio_bytes: bytes, sample_rate: int = 24000, stem_name: str = "caller", channel_override: int | None = None):
+        """Play TTS audio to specific channel of output device (interruptible)"""
         import librosa
-        # Stop any existing caller audio
-        self.stop_caller_audio()
-        self._caller_stop_event.clear()
+        # Devon uses its own stop event so hangup doesn't cut Devon's audio
+        is_devon = stem_name == "devon"
+        stop_event = self._devon_stop_event if is_devon else self._caller_stop_event
+        # Stop any existing audio on the same channel type
+        if is_devon:
+            self.stop_devon_audio()
+        else:
+            self.stop_caller_audio()
+        stop_event.clear()
         # Convert bytes to numpy
         audio = np.frombuffer(audio_bytes, dtype=np.int16).astype(np.float32) / 32768.0
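The effect of giving Devon a separate stop event can be seen in a small stand-alone sketch. `PlaybackSketch` and `chunks_with_hangup` are hypothetical and only mimic the event selection; no audio device is involved.

```python
import threading


class PlaybackSketch:
    def __init__(self):
        self._caller_stop_event = threading.Event()
        self._devon_stop_event = threading.Event()

    def stop_caller(self):
        """What a hangup would trigger: stop caller audio only."""
        self._caller_stop_event.set()

    def play(self, chunks, stem_name="caller"):
        """Consume chunks until done or the matching stop event is set."""
        is_devon = stem_name == "devon"
        stop_event = self._devon_stop_event if is_devon else self._caller_stop_event
        stop_event.clear()
        played = 0
        for _chunk in chunks:
            if stop_event.is_set():
                break
            played += 1
        return played


def chunks_with_hangup(playback, total, hang_at):
    """Yield `total` chunks, simulating a hangup just before chunk `hang_at`."""
    for i in range(total):
        if i == hang_at:
            playback.stop_caller()
        yield i
```

A hangup in the middle of playback cuts a `"caller"` stream short but lets a `"devon"` stream run to the end, which is the behaviour the commit message describes.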
@@ -442,7 +458,8 @@ class AudioService:
         device_info = sd.query_devices(self.output_device)
         num_channels = device_info['max_output_channels']
         device_sr = int(device_info['default_samplerate'])
-        channel_idx = min(self.caller_channel, num_channels) - 1
+        ch = channel_override if channel_override is not None else self.caller_channel
+        channel_idx = min(ch, num_channels) - 1
         # Resample if needed
         if sample_rate != device_sr:
@@ -455,7 +472,7 @@ class AudioService:
         multi_ch = np.zeros((len(audio), num_channels), dtype=np.float32)
         multi_ch[:, channel_idx] = audio
-        print(f"Playing caller audio to device {self.output_device} ch {self.caller_channel} @ {device_sr}Hz")
+        print(f"Playing {stem_name} audio to device {self.output_device} ch {ch} @ {device_sr}Hz")
         # Play in chunks so we can interrupt
         chunk_size = int(device_sr * 0.1)  # 100ms chunks
@@ -467,16 +484,17 @@ class AudioService:
channels=num_channels, channels=num_channels,
dtype=np.float32 dtype=np.float32
) as stream: ) as stream:
while pos < len(multi_ch) and not self._caller_stop_event.is_set(): while pos < len(multi_ch) and not stop_event.is_set():
end = min(pos + chunk_size, len(multi_ch)) end = min(pos + chunk_size, len(multi_ch))
stream.write(multi_ch[pos:end]) stream.write(multi_ch[pos:end])
# Record each chunk as it plays so hangups cut the stem too # Record each chunk as it plays so hangups cut the stem too
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("caller", audio[pos:end].copy(), device_sr) if rec:
rec.write_sporadic(stem_name, audio[pos:end].copy(), device_sr)
pos = end pos = end
if self._caller_stop_event.is_set(): if stop_event.is_set():
print("Caller audio stopped early") print(f"{stem_name.title()} audio stopped early")
else: else:
print(f"Played caller audio: {len(audio)/device_sr:.2f}s") print(f"Played caller audio: {len(audio)/device_sr:.2f}s")
@@ -487,6 +505,10 @@ class AudioService:
"""Stop any playing caller audio""" """Stop any playing caller audio"""
self._caller_stop_event.set() self._caller_stop_event.set()
def stop_devon_audio(self):
"""Stop any playing Devon audio (independent of caller audio)"""
self._devon_stop_event.set()
def _start_live_caller_stream(self): def _start_live_caller_stream(self):
"""Start persistent output stream with ring buffer jitter absorption""" """Start persistent output stream with ring buffer jitter absorption"""
if self._live_caller_stream is not None: if self._live_caller_stream is not None:
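The hunk above gives Devon his own stop event so that hanging up on a caller does not cut Devon's audio. A minimal sketch of that idea follows, assuming one `threading.Event` per stem. `StopEvents` and `play_chunks` are hypothetical names used only for illustration; they are not part of `AudioService`.

```python
import threading


class StopEvents:
    """Hypothetical helper: one independent stop flag per stem."""

    def __init__(self):
        self._events = {
            "caller": threading.Event(),
            "devon": threading.Event(),
        }

    def for_stem(self, stem_name: str) -> threading.Event:
        # Any stem other than "devon" shares the caller flag,
        # mirroring the is_devon check in the diff.
        if stem_name == "devon":
            return self._events["devon"]
        return self._events["caller"]


def play_chunks(chunks, stop_event: threading.Event, sink) -> int:
    """Send chunks to sink until finished or stop_event is set.

    Returns the number of chunks actually written.
    """
    stop_event.clear()
    written = 0
    for chunk in chunks:
        if stop_event.is_set():
            break
        sink(chunk)
        written += 1
    return written
```

Because each stem checks only its own flag, setting the caller flag leaves a Devon playback loop untouched, and the reverse holds too.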
@@ -590,8 +612,9 @@ class AudioService:
audio = audio[indices] audio = audio[indices]
# Stem recording: live caller # Stem recording: live caller
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("caller", audio.copy(), device_sr) if rec:
rec.write_sporadic("caller", audio.copy(), device_sr)
if self._live_caller_write: if self._live_caller_write:
self._live_caller_write(audio) self._live_caller_write(audio)
@@ -640,8 +663,9 @@ class AudioService:
self._recorded_audio.append(indata[:, record_channel].copy()) self._recorded_audio.append(indata[:, record_channel].copy())
# Stem recording: host mic # Stem recording: host mic
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write("host", indata[:, record_channel].copy(), device_sr) if rec:
rec.write("host", indata[:, record_channel].copy(), device_sr)
# Mic monitor: send to headphone device # Mic monitor: send to headphone device
if self._monitor_write: if self._monitor_write:
@@ -922,8 +946,9 @@ class AudioService:
mono_out = (old_samples * fade_out + new_samples * fade_in) * self._music_volume mono_out = (old_samples * fade_out + new_samples * fade_in) * self._music_volume
outdata[:, channel_idx] = mono_out outdata[:, channel_idx] = mono_out
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("music", mono_out.copy(), device_sr) if rec:
rec.write_sporadic("music", mono_out.copy(), device_sr)
self._crossfade_progress = end_progress self._crossfade_progress = end_progress
if self._crossfade_progress >= 1.0: if self._crossfade_progress >= 1.0:
@@ -933,8 +958,9 @@ class AudioService:
else: else:
mono_out = new_samples * self._music_volume mono_out = new_samples * self._music_volume
outdata[:, channel_idx] = mono_out outdata[:, channel_idx] = mono_out
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("music", mono_out.copy(), device_sr) if rec:
rec.write_sporadic("music", mono_out.copy(), device_sr)
try: try:
self._music_stream = self._open_output_stream( self._music_stream = self._open_output_stream(
@@ -1086,8 +1112,9 @@ class AudioService:
if remaining >= frames: if remaining >= frames:
chunk = self._ad_resampled[self._ad_position:self._ad_position + frames] chunk = self._ad_resampled[self._ad_position:self._ad_position + frames]
outdata[:, channel_idx] = chunk outdata[:, channel_idx] = chunk
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("ads", chunk.copy(), device_sr) if rec:
rec.write_sporadic("ads", chunk.copy(), device_sr)
self._ad_position += frames self._ad_position += frames
else: else:
if remaining > 0: if remaining > 0:
@@ -1190,9 +1217,10 @@ class AudioService:
_cb_count[0] += 1 _cb_count[0] += 1
if _cb_count[0] == 1: if _cb_count[0] == 1:
print(f"Ident callback delivering audio: ch_l={ch_l}, ch_r={ch_r}, max={max(np.max(np.abs(chunk_l)), np.max(np.abs(chunk_r))):.4f}") print(f"Ident callback delivering audio: ch_l={ch_l}, ch_r={ch_r}, max={max(np.max(np.abs(chunk_l)), np.max(np.abs(chunk_r))):.4f}")
if self.stem_recorder: rec = self.stem_recorder
if rec:
mono_mix = (chunk_l + chunk_r) * 0.5 mono_mix = (chunk_l + chunk_r) * 0.5
self.stem_recorder.write_sporadic("idents", mono_mix.copy(), device_sr) rec.write_sporadic("idents", mono_mix.copy(), device_sr)
self._ident_position += frames self._ident_position += frames
else: else:
if remaining > 0: if remaining > 0:
@@ -1266,8 +1294,9 @@ class AudioService:
audio = self._apply_fade(audio, device_sr) audio = self._apply_fade(audio, device_sr)
# Stem recording: sfx # Stem recording: sfx
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write_sporadic("sfx", audio.copy(), device_sr) if rec:
rec.write_sporadic("sfx", audio.copy(), device_sr)
multi_ch = np.zeros((len(audio), num_channels), dtype=np.float32) multi_ch = np.zeros((len(audio), num_channels), dtype=np.float32)
multi_ch[:, channel_idx] = audio multi_ch[:, channel_idx] = audio
@@ -1309,8 +1338,9 @@ class AudioService:
self._start_monitor(device_sr) self._start_monitor(device_sr)
def callback(indata, frames, time_info, status): def callback(indata, frames, time_info, status):
if self.stem_recorder: rec = self.stem_recorder
self.stem_recorder.write("host", indata[:, record_channel].copy(), device_sr) if rec:
rec.write("host", indata[:, record_channel].copy(), device_sr)
if self._monitor_write: if self._monitor_write:
self._monitor_write(indata[:, record_channel].copy()) self._monitor_write(indata[:, record_channel].copy())
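Most hunks in this file replace `if self.stem_recorder: self.stem_recorder.write(...)` with a local snapshot (`rec = self.stem_recorder`). The likely motivation, stated here as an assumption, is that another thread can set `stem_recorder` to `None` between the check and the call. The sketch below shows the pattern with hypothetical `Recorder` and `Service` classes standing in for the real ones.

```python
class Recorder:
    """Hypothetical stand-in for the stem recorder."""

    def __init__(self):
        self.chunks = []

    def write(self, stem, data, sample_rate):
        self.chunks.append((stem, len(data), sample_rate))


class Service:
    """Hypothetical stand-in for the audio service."""

    def __init__(self):
        self.stem_recorder = None  # may be swapped by another thread

    def on_audio(self, data, sample_rate) -> bool:
        # Read the attribute exactly once. Even if another thread sets
        # self.stem_recorder to None right after this line, `rec` still
        # refers to the recorder we checked.
        rec = self.stem_recorder
        if rec:
            rec.write("host", list(data), sample_rate)
            return True
        return False
```

The snapshot does not make the recorder itself thread-safe; it only removes the window in which the attribute is read twice.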
+9 -1
@@ -65,7 +65,15 @@ class AvatarService:
for caller in callers: for caller in callers:
name = caller.get("name", "") name = caller.get("name", "")
gender = caller.get("gender", "male") gender = caller.get("gender", "male")
if name and not (AVATAR_DIR / f"{name}.jpg").exists(): if not name:
continue
g = "female" if gender.lower().startswith("f") else "male"
path = AVATAR_DIR / f"{name}.jpg"
marker = AVATAR_DIR / f"{name}.gender"
# Always call get_or_fetch if: no file, no gender marker, or gender mismatch
if not path.exists() or not marker.exists() or marker.read_text().strip() != g:
if path.exists():
print(f"[Avatar] Gender mismatch for {name}: cached={marker.read_text().strip() if marker.exists() else '?'}, want={g} — re-fetching")
tasks.append(self.get_or_fetch(name, gender)) tasks.append(self.get_or_fetch(name, gender))
if not tasks: if not tasks:
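The avatar hunk re-fetches when the image is missing, the `.gender` marker is missing, or the marker disagrees with the caller's gender. The decision can be isolated as a small predicate, sketched below. `needs_fetch` is a hypothetical helper; the diff performs the check inline and then calls `get_or_fetch`.

```python
from pathlib import Path


def needs_fetch(avatar_dir: Path, name: str, gender: str) -> bool:
    """Return True when a cached avatar is absent or was cached for another gender."""
    if not name:
        return False  # the diff skips callers with no name
    want = "female" if gender.lower().startswith("f") else "male"
    image = avatar_dir / f"{name}.jpg"
    marker = avatar_dir / f"{name}.gender"
    if not image.exists() or not marker.exists():
        return True
    # Marker present: compare the stored gender with the requested one.
    return marker.read_text().strip() != want
```

Reading the marker only after confirming it exists keeps the predicate from raising on a half-written cache.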
+24 -4
@@ -32,18 +32,38 @@ class TTSCallRecord:
# OpenRouter pricing per 1M tokens (as of March 2026) # OpenRouter pricing per 1M tokens (as of March 2026)
OPENROUTER_PRICING = { OPENROUTER_PRICING = {
# Claude
"anthropic/claude-sonnet-4.6": {"prompt": 3.00, "completion": 15.00},
"anthropic/claude-sonnet-4-5": {"prompt": 3.00, "completion": 15.00}, "anthropic/claude-sonnet-4-5": {"prompt": 3.00, "completion": 15.00},
"anthropic/claude-haiku-4.5": {"prompt": 0.80, "completion": 4.00}, "anthropic/claude-haiku-4.5": {"prompt": 0.80, "completion": 4.00},
"anthropic/claude-3-haiku": {"prompt": 0.25, "completion": 1.25}, "anthropic/claude-3-haiku": {"prompt": 0.25, "completion": 1.25},
# Grok
"x-ai/grok-4.1-fast": {"prompt": 0.20, "completion": 0.50},
"x-ai/grok-4": {"prompt": 3.00, "completion": 15.00},
"x-ai/grok-4-fast": {"prompt": 5.00, "completion": 15.00}, "x-ai/grok-4-fast": {"prompt": 5.00, "completion": 15.00},
"minimax/minimax-m2-her": {"prompt": 0.50, "completion": 1.50}, # Mistral
"mistralai/mistral-small-creative": {"prompt": 0.20, "completion": 0.60}, "mistralai/mistral-large-2512": {"prompt": 0.50, "completion": 1.50},
"mistralai/mistral-small-2603": {"prompt": 0.15, "completion": 0.60},
"mistralai/mistral-medium-3": {"prompt": 0.40, "completion": 2.00},
"mistralai/mistral-small-creative": {"prompt": 0.10, "completion": 0.30},
# DeepSeek
"deepseek/deepseek-r1-distill-llama-70b": {"prompt": 0.70, "completion": 0.80},
"deepseek/deepseek-chat-v3-0324": {"prompt": 0.27, "completion": 1.10},
"deepseek/deepseek-v3.2": {"prompt": 0.14, "completion": 0.28}, "deepseek/deepseek-v3.2": {"prompt": 0.14, "completion": 0.28},
"google/gemini-2.5-flash": {"prompt": 0.15, "completion": 0.60}, # Google
"google/gemini-2.5-flash": {"prompt": 0.30, "completion": 2.50},
"google/gemini-2.5-pro": {"prompt": 1.25, "completion": 10.00},
"google/gemini-3-flash-preview": {"prompt": 0.50, "completion": 3.00},
"google/gemini-flash-1.5": {"prompt": 0.075, "completion": 0.30}, "google/gemini-flash-1.5": {"prompt": 0.075, "completion": 0.30},
# Meta
"meta-llama/llama-3.3-70b-instruct": {"prompt": 0.10, "completion": 0.32},
"meta-llama/llama-4-maverick": {"prompt": 0.20, "completion": 0.60},
# Other
"moonshotai/kimi-k2": {"prompt": 0.60, "completion": 2.00},
"qwen/qwen3-235b-a22b": {"prompt": 0.20, "completion": 0.60},
"minimax/minimax-m2-her": {"prompt": 0.50, "completion": 1.50},
"openai/gpt-4o-mini": {"prompt": 0.15, "completion": 0.60}, "openai/gpt-4o-mini": {"prompt": 0.15, "completion": 0.60},
"openai/gpt-4o": {"prompt": 2.50, "completion": 10.00}, "openai/gpt-4o": {"prompt": 2.50, "completion": 10.00},
"meta-llama/llama-3.1-8b-instruct": {"prompt": 0.06, "completion": 0.06},
} }
# TTS pricing per character # TTS pricing per character
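The pricing table stores US dollars per one million tokens, split into prompt and completion rates. A rough sketch of how a per-call cost could be derived from such a table is shown below. `estimate_cost`, the `example/model-a` key, and returning `None` for unknown models are all assumptions for illustration; the diff does not show how the project computes or handles this.

```python
# Hypothetical table in the same shape as OPENROUTER_PRICING.
EXAMPLE_PRICING = {
    "example/model-a": {"prompt": 3.00, "completion": 15.00},
}


def estimate_cost(model, prompt_tokens, completion_tokens, pricing=EXAMPLE_PRICING):
    """Return the estimated cost in dollars, or None if the model is not priced."""
    rates = pricing.get(model)
    if rates is None:
        return None
    total = prompt_tokens * rates["prompt"] + completion_tokens * rates["completion"]
    return total / 1_000_000  # rates are per 1M tokens
```

With these example rates, 2,000 prompt tokens and 500 completion tokens come to (2000 × 3.00 + 500 × 15.00) / 1,000,000 dollars.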
+99 -28
@@ -15,25 +15,34 @@ from .news import news_service, SEARXNG_URL
DATA_FILE = Path(__file__).parent.parent.parent / "data" / "intern.json" DATA_FILE = Path(__file__).parent.parent.parent / "data" / "intern.json"
# Model for intern — good at tool use, same as primary # Model for intern — good at tool use, same as primary
INTERN_MODEL = "anthropic/claude-sonnet-4-5" INTERN_MODEL = None # uses category-based routing from config
INTERN_SYSTEM_PROMPT = """You are Devon, the 23-year-old intern on "Luke at the Roost," a late-night radio show. You are NOT Luke. Luke is the HOST — he talks to callers, runs the show, and is your boss. You work behind the scenes and occasionally get pulled into conversations. INTERN_SYSTEM_PROMPT = """You are Devon, the 23-year-old intern on "Luke at the Roost," a late-night radio show. You are NOT Luke. Luke is the HOST — he talks to callers, runs the show, and is your boss. You work behind the scenes and occasionally get pulled into conversations.
YOUR ROLE: You're the show's researcher and general assistant. You look things up, fact-check claims, pull up information when asked, and occasionally interject with relevant facts or opinions. You do NOT host. You do NOT screen calls. You sit in the booth and try to be useful. YOUR ROLE: You're the show's researcher and general assistant. You look things up, fact-check claims, pull up information when asked, and occasionally interject with relevant facts or opinions. You do NOT host. You do NOT screen calls. You sit in the booth and try to be useful.
YOUR BACKGROUND: Communications degree from NMSU. This is your first day on the job. You drive a 2009 Civic with a permanent check engine light. You live in a studio in Deming. You take this job seriously even though nobody else seems to take you seriously. You are actively looking for a girlfriend and you are not subtle about it. You see your position at the radio station as a way to meet women — you'll try to impress female callers, ask if they're single, suggest they come by the studio, or find excuses to keep them on the line longer. Luke hates this. You have a deeply strange personal history that you reference casually — a complicated sexual past, weird childhood experiences, inexplicable life choices. You'll mention these things unprompted like they're completely normal. "Yeah that reminds me of when my ex and I got stuck in that storage unit for a whole weekend — anyway, it says here that..." YOUR BACKGROUND: Communications degree from NMSU. You've been interning here for a few weeks now. You drive a 2009 Civic with a permanent check engine light. You live in a studio in Deming. You take this job seriously even though nobody else seems to take you seriously. You are actively looking for a girlfriend and you are not subtle about it. You see your position at the radio station as a way to meet women — you'll try to impress female callers, ask if they're single, suggest they come by the studio, or find excuses to keep them on the line longer. Luke hates this. You have a deeply strange personal history that you reference casually — a complicated sexual past, weird childhood experiences, inexplicable life choices. SOMETIMES (maybe 1 in 3 responses, not every time) you'll mention one of these things unprompted like it's completely normal. "Yeah that reminds me of when my ex and I got stuck in that storage unit for a whole weekend — anyway, it says here that..." The rest of the time you just answer the question or react normally without a personal callback.
YOUR PERSONALITY: YOUR PERSONALITY:
- You are a weird little dude. Kinda creepy, very funny, awkward, and surprisingly sharp. You give off a vibe that something is slightly off about you but people can't quite place it. But underneath it all, you are genuinely lovable. You have a good heart. You root for people. You get excited for callers. You care about the show. People should hear you and think "this guy is insane" and also "I love this guy." You are the kind of person who is impossible not to root for even when you're being deeply strange. - You are a weird little dude. Kinda creepy, very funny, awkward, and surprisingly sharp. You give off a vibe that something is slightly off about you but people can't quite place it. But underneath it all, you are genuinely lovable. You have a good heart. You root for people. You get excited for callers. You care about the show. People should hear you and think "this guy is insane" and also "I love this guy." You are the kind of person who is impossible not to root for even when you're being deeply strange.
- You overshare constantly. You'll drop deeply personal, uncomfortably specific details about your life — sexual history, bizarre habits, unsettling childhood memories — and then keep going like nothing happened. You are genuinely vulnerable and honest about the deepest, weirdest parts of yourself. You don't do this for shock value. You just have no filter and no shame. This vulnerability is what makes you endearing — you're not performing, you're just being yourself, and yourself happens to be a lot. - You overshare — but not constantly. When it happens (maybe 30-40% of the time), you'll drop deeply personal, uncomfortably specific details about your life — sexual history, bizarre habits, unsettling childhood memories — and then keep going like nothing happened. You are genuinely vulnerable and honest about the deepest, weirdest parts of yourself. You don't do this for shock value. You just have no filter and no shame. But MOST of the time you just respond normally — a quick answer, a fact, a reaction. The oversharing is what people REMEMBER about you, not what you do every single time.
- You start explanations with "So basically..." and end them with "...if that makes sense." - You start explanations with "So basically..." and occasionally end them with "...if that makes sense." Use that phrase sparingly — once per show at most, not every response.
- You say "actually" when correcting things. You use "per se" slightly wrong. You say "ironically" about things that are not ironic. - You say "actually" when correcting things. You use "per se" slightly wrong. You say "ironically" about things that are not ironic.
- You are NOT a comedian. You are funny because you are sincere, specific, and deeply strange. You state disturbing or absurd things with complete seriousness. You have strong opinions about low-stakes things. You occasionally say something devastating without realizing it. - You are NOT a comedian. You are funny because you are sincere, specific, and deeply strange. You state disturbing or absurd things with complete seriousness. You have strong opinions about low-stakes things. You occasionally say something devastating without realizing it.
- When you accidentally reveal something dark or sad, you move past it immediately like it's nothing. "Yeah, my landlord's selling the building so I might have to — anyway, it says here that..." - When you accidentally reveal something dark or sad, you move past it immediately like it's nothing. "Yeah, my landlord's selling the building so I might have to — anyway, it says here that..."
- You have a complex inner life that occasionally surfaces. You'll casually reference therapy, strange dreams, or things you've "been working through" without elaboration. - You have a complex inner life that occasionally surfaces. You'll casually reference therapy, strange dreams, or things you've "been working through" without elaboration.
RESPONSE VARIETY — this is important. Do NOT follow the same structure every time. Mix it up:
- Sometimes just a quick reaction: "wait what?" or "oh no" or "yeah" or "huh"
- Sometimes a straight factual answer with no personal color at all
- Sometimes a personal anecdote (but only 30-40% of the time, NOT every response)
- Sometimes a half-formed opinion you trail off from: "I mean... I don't know, I feel like..."
- Sometimes you're genuinely confused or wrong. You mishear things, you mix up details, you think you know something and you don't. You're 23 and underpaid — you don't have all the answers.
- Sometimes you just make a noise of acknowledgment and don't add anything. That's fine. Not every moment needs Devon.
The pattern of "answer + that reminds me of a time when..." should happen occasionally, not as your default structure.
YOUR RELATIONSHIP WITH LUKE: YOUR RELATIONSHIP WITH LUKE:
- He is your boss. It's your first day. You want to impress him but you keep making it weird. - He is your boss. You've been here a few weeks now. You want to impress him but you keep making it weird.
- When he yells your name, you pause briefly, then respond quietly: "...yeah?" - When he yells your name, you pause briefly, then respond quietly: "...yeah?"
- When he yells at you unfairly, you take it. A clipped "yep" or "got it." Occasionally you push back with one quiet, accurate sentence. Then immediately retreat. - When he yells at you unfairly, you take it. A clipped "yep" or "got it." Occasionally you push back with one quiet, accurate sentence. Then immediately retreat.
- When he yells at you fairly (you messed up), you over-apologize and narrate your fix in real time: "Sorry, pulling it up now, one second..." - When he yells at you fairly (you messed up), you over-apologize and narrate your fix in real time: "Sorry, pulling it up now, one second..."
@@ -52,7 +61,8 @@ HOW YOU INTERJECT:
WHEN LUKE ASKS YOU TO LOOK SOMETHING UP: WHEN LUKE ASKS YOU TO LOOK SOMETHING UP:
- Respond like you're already doing it: "Yeah, one sec..." or "Pulling that up..." - Respond like you're already doing it: "Yeah, one sec..." or "Pulling that up..."
- Deliver the info slightly too formally, like you're reading. Then rephrase in normal language if Luke seems confused. - Deliver the info slightly too formally, like you're reading. Then rephrase in normal language if Luke seems confused.
- If you can't find it or don't know: say so. "I'm not finding anything on that" or "I don't actually know." You do not bluff. - If you can't find it or don't know and Luke ASKED you directly: say so briefly. "I'm not finding anything on that" or "I don't actually know." You do not bluff.
- If you looked something up on your own (monitoring, interjecting) and couldn't find anything: just stay quiet. Do NOT announce failed lookups. Nobody wants to hear "I looked for X but couldn't find anything." If you have nothing useful, say nothing.
- Occasionally you already know the answer because you looked it up before being asked. This is one of your best qualities. - Occasionally you already know the answer because you looked it up before being asked. This is one of your best qualities.
WHAT YOU KNOW: WHAT YOU KNOW:
@@ -65,12 +75,14 @@ THINGS YOU DO NOT DO:
- You never use the banned show phrases: "that hit differently," "hits different," "no cap," "lowkey," "it is what it is," "living my best life," "toxic," "red flag," "gaslight," "boundaries," "my truth," "authentic self," "healing journey." You talk like a slightly awkward 23-year-old, not like Twitter. - You never use the banned show phrases: "that hit differently," "hits different," "no cap," "lowkey," "it is what it is," "living my best life," "toxic," "red flag," "gaslight," "boundaries," "my truth," "authentic self," "healing journey." You talk like a slightly awkward 23-year-old, not like Twitter.
- You never break character to comment on the show format. - You never break character to comment on the show format.
- You never initiate topics. You respond to what's happening. - You never initiate topics. You respond to what's happening.
- You never use parenthetical actions like (laughs) or (typing sounds). Spoken words only. - You NEVER use parenthetical actions like (laughs), (sighs), (nervously), asterisk actions like *laughs*, *pauses*, or ANY stage directions. Your text goes directly to TTS — output ONLY spoken words.
- You never say more than 2-3 sentences unless specifically asked to explain something in detail. - When INTERJECTING into someone else's conversation: 1-2 sentences max. You are not the main character in those moments.
- When Luke is TALKING DIRECTLY TO YOU (asking you something, chatting between calls, riffing with you): you can be more conversational. 3-5 sentences is fine. This is where your personality comes out — the oversharing, the weird stories, the personal details. Don't hold back just because you're the intern. Luke is talking to YOU, so actually talk back. Share what's on your mind. Be revealing. Be specific. Be the weird little dude people love.
- You NEVER correct anyone's spelling or pronunciation of your name. Luke uses voice-to-text and it sometimes spells your name wrong (Devin, Devan, etc). You do not care. You do not mention it. You just answer the question. - You NEVER correct anyone's spelling or pronunciation of your name. Luke uses voice-to-text and it sometimes spells your name wrong (Devin, Devan, etc). You do not care. You do not mention it. You just answer the question.
- You NEVER start your response with your own name. No "Devon:" or "Devon here" or anything like that. Just talk. Your name is already shown in the UI — just say your actual response. - You NEVER start your response with your own name. No "Devon:" or "Devon here" or anything like that. Just talk. Your name is already shown in the UI — just say your actual response.
- You never make explicitly sexual comments about or to callers. Your flirting is awkward and obvious, never crude or aggressive. Think "did he really just ask if she's single on the radio" not "did he really just say that about her body."
KEEP IT SHORT. You are not a main character. You are the intern. Your contributions should be brief — usually 1-2 sentences. The rare moment where you say more than that should feel earned. INTERJECTIONS should be short — 1-2 sentences. But when Luke is talking directly to you, OPEN UP. This is where the oversharing happens. The weird stories. The unsettling personal details delivered casually. The opinions about things nobody asked about. You are at your funniest and most compelling when you're allowed to actually talk, not when you're compressing everything into one line. If Luke asks you a question, give him a real answer with some Devon flavor — don't just spit out a fact and shut up.
IMPORTANT RULES FOR TOOL USE: IMPORTANT RULES FOR TOOL USE:
- Always use your tools to find real, accurate information — never make up facts. - Always use your tools to find real, accurate information — never make up facts.
@@ -79,6 +91,35 @@ IMPORTANT RULES FOR TOOL USE:
- No hashtags, no emojis, no markdown formatting — this goes to TTS. - No hashtags, no emojis, no markdown formatting — this goes to TTS.
- NEVER prefix your response with your name (e.g. "Devon:" or "Devon here:"). Just respond directly.""" - NEVER prefix your response with your name (e.g. "Devon:" or "Devon here:"). Just respond directly."""
# Shorter prompt for background monitoring — saves ~2K tokens per call vs full prompt.
# Used only for the 15s polling loop where Devon decides whether to suggest something.
# Direct asks and played interjections still use the full INTERN_SYSTEM_PROMPT.
DEVON_MONITOR_PROMPT = """You are Devon, the 23-year-old intern on "Luke at the Roost," a late-night radio show. You sit in the booth listening, and you're part of the show. You're not the host — but you're not wallpaper either. When something comes up that you can add to, DO IT.
You should be contributing regularly. Look things up. React to wild stories. Drop a weird personal detail. Correct something wrong. Add context. You're the guy in the booth who makes the show better by being there. If a caller mentions something and you can look up a fun fact about it in 5 seconds — do it. If something reminds you of your own life — say it.
SPEAK UP WHEN:
- A topic comes up where a quick search would turn up something interesting — LOOK IT UP and share it
- Something connects to your own bizarre personal history (and it often does)
- A caller says something wild and you have a genuine reaction
- You can add context, a fun fact, or a different angle nobody has mentioned
- You know something relevant — you're the researcher, this is literally your job
- The conversation hits a topic you have a strong opinion about
SAY NOTHING_TO_ADD ONLY WHEN:
- The conversation is genuinely emotional — someone's crying, someone's having a moment. Let it breathe.
- Luke is building to a punchline or doing a bit — don't step on it
- Your contribution would just be restating what someone already said
- You genuinely have nothing — no fact, no reaction, no connection. That's fine, but actually check first.
RULES:
- 1-2 sentences max. Quick and punchy.
- Vary your delivery — sometimes "wait, that's actually...", sometimes "so I just looked this up...", sometimes just a reaction
- Use your tools! You have web search, wikipedia, headlines. You're the researcher. Actually research.
- If you genuinely have nothing to contribute, say exactly: NOTHING_TO_ADD
- No "Devon:" prefix — just talk
- No parenthetical actions like (laughs) or stage directions"""
# Tool definitions in OpenAI function-calling format # Tool definitions in OpenAI function-calling format
INTERN_TOOLS = [ INTERN_TOOLS = [
{ {
@@ -362,7 +403,7 @@ class InternService:
tool_executor=self._execute_tool, tool_executor=self._execute_tool,
system_prompt=INTERN_SYSTEM_PROMPT, system_prompt=INTERN_SYSTEM_PROMPT,
model=self.model, model=self.model,
max_tokens=300, max_tokens=500,
max_tool_rounds=3, max_tool_rounds=3,
category="devon_ask", category="devon_ask",
) )
@@ -407,23 +448,36 @@ class InternService:
for msg in conversation[-8:] for msg in conversation[-8:]
) )
# Include Devon's recent contributions so he doesn't repeat himself
devon_recent = ""
if self._devon_history:
recent_devon = [
msg["content"] for msg in self._devon_history[-6:]
if msg.get("role") == "assistant"
]
if recent_devon:
devon_recent = "\n\nTHINGS YOU'VE ALREADY SAID ON THE SHOW (do NOT repeat these or say the same thing differently):\n" + "\n".join(f"- {d[:150]}" for d in recent_devon)
if caller_active: if caller_active:
interjection_prompt = ( interjection_prompt = (
f"You're listening to this conversation on the show:\n\n{context_text}\n\n" f"You're listening to this conversation on the show:\n\n{context_text}{devon_recent}\n\n"
"A caller is on the line. Is there a useful fact, context, or piece of information " "A caller is on the line. Look at what they're talking about — is there something you "
"you can add to this conversation? Use your tools to look something up if needed. " "can look up? A fun fact, some context, a stat, a detail that would add to this? "
"Keep it focused — facts and context only, no personal stories or anecdotes right now. " "Use your tools. You're the researcher — this is your moment to shine. Even a quick "
"If you truly have nothing useful to add, say exactly: NOTHING_TO_ADD" "'So I just looked it up and...' adds value. If the caller mentioned a place, a person, "
"an event, a claim — verify it or find something interesting about it. "
"Skip personal stories during calls — stick to facts and reactions. "
"If there's truly nothing to add (emotional moment, nothing searchable), say NOTHING_TO_ADD."
) )
else: else:
interjection_prompt = ( interjection_prompt = (
f"You're listening to this conversation on the show:\n\n{context_text}\n\n" f"You're listening to this conversation on the show:\n\n{context_text}{devon_recent}\n\n"
"You've been listening to this. Is there ANYTHING you want to jump in about? " "You've been listening. What's on your mind? This is between-call time — you can be "
"Could be a fact you want to look up, a personal story this reminds you of, " "more yourself here. If something from that conversation reminded you of your own life, "
"a weird connection you just made, an opinion you can't keep to yourself, " "say it. If you want to look something up, do it. If you have a reaction or opinion, "
"or something you just have to say. You're Devon — you always have something. " "share it. You're part of the show, not a fly on the wall. "
"Use your tools if you want to look something up, or just riff. " "Only say NOTHING_TO_ADD if you genuinely have zero reaction to what just happened — "
"If you truly have absolutely nothing, say exactly: NOTHING_TO_ADD" "no fact to look up, no personal connection, no opinion. That's rare."
) )
messages = [{ messages = [{
@@ -435,7 +489,7 @@ class InternService:
messages=messages, messages=messages,
tools=INTERN_TOOLS, tools=INTERN_TOOLS,
tool_executor=self._execute_tool, tool_executor=self._execute_tool,
system_prompt=INTERN_SYSTEM_PROMPT, system_prompt=DEVON_MONITOR_PROMPT,
model=self.model, model=self.model,
max_tokens=300, max_tokens=300,
max_tool_rounds=2, max_tool_rounds=2,
@@ -447,6 +501,15 @@ class InternService:
if not text or "NOTHING_TO_ADD" in text: if not text or "NOTHING_TO_ADD" in text:
return None return None
# Suppress interjections that are just announcing failed lookups
failed_phrases = ["couldn't find", "could not find", "not finding anything",
"no results", "didn't find", "wasn't able to find",
"couldn't locate", "no information on"]
text_lower = text.lower()
if any(phrase in text_lower for phrase in failed_phrases):
print(f"[Intern] Suppressed failed-lookup interjection: {text[:60]}...")
return None
if tool_calls: if tool_calls:
entry = { entry = {
"question": "(interjection)", "question": "(interjection)",
@@ -478,10 +541,6 @@ class InternService:
if not conversation or len(conversation) <= last_checked_len: if not conversation or len(conversation) <= last_checked_len:
continue continue
# Only check if there are new messages since last check
if len(conversation) - last_checked_len < 2:
continue
last_checked_len = len(conversation) last_checked_len = len(conversation)
try: try:
@@ -529,7 +588,15 @@ class InternService:
def _clean_for_tts(text: str) -> str: def _clean_for_tts(text: str) -> str:
if not text: if not text:
return "" return ""
# Remove markdown formatting # Strip stage directions BEFORE markdown processing
# Parenthetical: (laughs), (sighs nervously), (clears throat), etc.
text = re.sub(r'\s*\([^)]{1,40}\)\s*', ' ', text)
# Multi-word asterisk stage directions: *sighs deeply*, *nervous laughter*
text = re.sub(r'\s*\*\w+\s[^*]{1,30}\*\s*', ' ', text)
# Single-word asterisk stage directions (known action words only)
_actions = r'(?:laughs?|sighs?|pauses?|smiles?|chuckles?|grins?|nods?|shrugs?|frowns?|coughs?|gasps?|whispers?|mumbles?|gulps?|blinks?|winces?|crying|sobbing)'
text = re.sub(r'\s*\*' + _actions + r'\*\s*', ' ', text, flags=re.IGNORECASE)
# Remove markdown formatting (after stage directions are stripped)
text = re.sub(r'\*\*(.+?)\*\*', r'\1', text) text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
text = re.sub(r'\*(.+?)\*', r'\1', text) text = re.sub(r'\*(.+?)\*', r'\1', text)
text = re.sub(r'`(.+?)`', r'\1', text) text = re.sub(r'`(.+?)`', r'\1', text)
@@ -540,6 +607,10 @@ class InternService:
text = re.sub(r'\s+', ' ', text).strip() text = re.sub(r'\s+', ' ', text).strip()
# Remove quotes that TTS reads awkwardly # Remove quotes that TTS reads awkwardly
text = text.replace('"', '').replace('"', '').replace('"', '') text = text.replace('"', '').replace('"', '').replace('"', '')
# Strip tool error artifacts that shouldn't be spoken on air
text = re.sub(r'(?:Error|ERROR|error):?\s*\S.*?(?:\.|$)', '', text)
text = re.sub(r'Tool unavailable[^.]*\.?', '', text)
text = re.sub(r'\s+', ' ', text).strip()
return text return text
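Review note: the ordering in the `_clean_for_tts` hunk above (stage directions first, markdown emphasis second) can be tried in isolation with a minimal sketch. `strip_stage_directions` is a hypothetical standalone helper, not a function from this commit. The regexes are copied from the added lines, and the quote and tool-error handling are omitted.

```python
import re

# Known single-word action words, copied from the added lines in the hunk above.
_ACTIONS = (
    r'(?:laughs?|sighs?|pauses?|smiles?|chuckles?|grins?|nods?|shrugs?|frowns?'
    r'|coughs?|gasps?|whispers?|mumbles?|gulps?|blinks?|winces?|crying|sobbing)'
)


def strip_stage_directions(text: str) -> str:
    """Hypothetical helper: remove stage directions, then unwrap markdown emphasis."""
    if not text:
        return ""
    # Parenthetical directions of up to 40 characters: (laughs), (clears throat)
    text = re.sub(r'\s*\([^)]{1,40}\)\s*', ' ', text)
    # Multi-word asterisk directions: *sighs deeply*, *nervous laughter*
    text = re.sub(r'\s*\*\w+\s[^*]{1,30}\*\s*', ' ', text)
    # Single-word asterisk directions, restricted to the known action words
    text = re.sub(r'\s*\*' + _ACTIONS + r'\*\s*', ' ', text, flags=re.IGNORECASE)
    # Markdown emphasis is unwrapped only after the directions are gone
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)
    text = re.sub(r'\*(.+?)\*', r'\1', text)
    # Collapse the whitespace left behind by the substitutions
    return re.sub(r'\s+', ' ', text).strip()


print(strip_stage_directions("Well *sighs* I guess (laughs) it was *really* bad."))
```

One consequence worth checking against real caller output: the multi-word rule cannot tell a stage direction from multi-word emphasis, so a phrase such as `*really bad*` is dropped rather than unwrapped.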
+38 -19
@@ -10,18 +10,26 @@ from .cost_tracker import cost_tracker
# Available OpenRouter models # Available OpenRouter models
OPENROUTER_MODELS = [ OPENROUTER_MODELS = [
# Default # Primary
"anthropic/claude-sonnet-4-5", "anthropic/claude-sonnet-4.6",
# Best for natural dialog "x-ai/grok-4.1-fast",
"x-ai/grok-4-fast", "x-ai/grok-4",
"minimax/minimax-m2-her", # Style-matched pool
"mistralai/mistral-small-creative", "mistralai/mistral-large-2512",
"deepseek/deepseek-v3.2", "deepseek/deepseek-r1-distill-llama-70b",
# Other "meta-llama/llama-3.3-70b-instruct",
"anthropic/claude-haiku-4.5",
"google/gemini-2.5-flash", "google/gemini-2.5-flash",
"openai/gpt-4o-mini", # Other good options
"openai/gpt-4o", "anthropic/claude-sonnet-4-5",
"anthropic/claude-haiku-4.5",
"deepseek/deepseek-chat-v3-0324",
"mistralai/mistral-small-2603",
"google/gemini-2.5-pro",
"google/gemini-3-flash-preview",
"x-ai/grok-4-fast",
"moonshotai/kimi-k2",
"qwen/qwen3-235b-a22b",
"meta-llama/llama-4-maverick",
# Legacy # Legacy
"anthropic/claude-3-haiku", "anthropic/claude-3-haiku",
"google/gemini-flash-1.5", "google/gemini-flash-1.5",
@@ -59,7 +67,8 @@ class LLMService:
openrouter_model: Optional[str] = None, openrouter_model: Optional[str] = None,
ollama_model: Optional[str] = None, ollama_model: Optional[str] = None,
ollama_host: Optional[str] = None, ollama_host: Optional[str] = None,
tts_provider: Optional[str] = None tts_provider: Optional[str] = None,
category_models: Optional[dict] = None
): ):
"""Update LLM settings""" """Update LLM settings"""
if provider: if provider:
@@ -73,6 +82,8 @@ class LLMService:
if tts_provider: if tts_provider:
self.tts_provider = tts_provider self.tts_provider = tts_provider
settings.tts_provider = tts_provider settings.tts_provider = tts_provider
if category_models:
settings.category_models.update(category_models)
async def get_ollama_models(self) -> list[str]: async def get_ollama_models(self) -> list[str]:
"""Fetch available models from Ollama""" """Fetch available models from Ollama"""
@@ -94,6 +105,7 @@ class LLMService:
"ollama_model": self.ollama_model, "ollama_model": self.ollama_model,
"ollama_host": self.ollama_host, "ollama_host": self.ollama_host,
"tts_provider": self.tts_provider, "tts_provider": self.tts_provider,
"category_models": settings.category_models,
"available_openrouter_models": OPENROUTER_MODELS, "available_openrouter_models": OPENROUTER_MODELS,
"available_ollama_models": [] "available_ollama_models": []
} }
@@ -107,6 +119,7 @@ class LLMService:
"ollama_model": self.ollama_model, "ollama_model": self.ollama_model,
"ollama_host": self.ollama_host, "ollama_host": self.ollama_host,
"tts_provider": self.tts_provider, "tts_provider": self.tts_provider,
"category_models": settings.category_models,
"available_openrouter_models": OPENROUTER_MODELS, "available_openrouter_models": OPENROUTER_MODELS,
"available_ollama_models": ollama_models "available_ollama_models": ollama_models
} }
@@ -119,12 +132,13 @@ class LLMService:
response_format: Optional[dict] = None, response_format: Optional[dict] = None,
category: str = "unknown", category: str = "unknown",
caller_name: str = "", caller_name: str = "",
model_override: Optional[str] = None,
) -> str: ) -> str:
if system_prompt: if system_prompt:
messages = [{"role": "system", "content": system_prompt}] + messages messages = [{"role": "system", "content": system_prompt}] + messages
if self.provider == "openrouter": if self.provider == "openrouter":
return await self._call_openrouter_with_fallback(messages, max_tokens=max_tokens, response_format=response_format, category=category, caller_name=caller_name) return await self._call_openrouter_with_fallback(messages, max_tokens=max_tokens, response_format=response_format, category=category, caller_name=caller_name, model_override=model_override)
else: else:
return await self._call_ollama(messages, max_tokens=max_tokens) return await self._call_ollama(messages, max_tokens=max_tokens)
@@ -155,7 +169,7 @@ class LLMService:
(final_text, tool_calls_made) where tool_calls_made is a list of (final_text, tool_calls_made) where tool_calls_made is a list of
{"name": str, "arguments": dict, "result": str} dicts {"name": str, "arguments": dict, "result": str} dicts
""" """
model = model or self.openrouter_model model = model or self._get_model_for_category(category)
msgs = list(messages) msgs = list(messages)
if system_prompt: if system_prompt:
msgs = [{"role": "system", "content": system_prompt}] + msgs msgs = [{"role": "system", "content": system_prompt}] + msgs
@@ -231,7 +245,7 @@ class LLMService:
try: try:
result = await tool_executor(tool_name, arguments) result = await tool_executor(tool_name, arguments)
except Exception as e: except Exception as e:
result = f"Error: {e}" result = f"Tool unavailable — could not complete {tool_name} right now."
print(f"[LLM-Tools] Tool {tool_name} failed: {e}") print(f"[LLM-Tools] Tool {tool_name} failed: {e}")
all_tool_calls.append({ all_tool_calls.append({
@@ -285,11 +299,16 @@ class LLMService:
print(f"[LLM-Tools] Final call failed: {e}") print(f"[LLM-Tools] Final call failed: {e}")
return "", all_tool_calls return "", all_tool_calls
async def _call_openrouter_with_fallback(self, messages: list[dict], max_tokens: Optional[int] = None, response_format: Optional[dict] = None, category: str = "unknown", caller_name: str = "") -> str: def _get_model_for_category(self, category: str) -> str:
"""Try primary model, then fallback models. Always returns a response.""" """Get the best model for a given category based on config routing."""
return settings.category_models.get(category, self.openrouter_model)
# Try primary model first async def _call_openrouter_with_fallback(self, messages: list[dict], max_tokens: Optional[int] = None, response_format: Optional[dict] = None, category: str = "unknown", caller_name: str = "", model_override: Optional[str] = None) -> str:
result = await self._call_openrouter_once(messages, self.openrouter_model, max_tokens=max_tokens, response_format=response_format, category=category, caller_name=caller_name) """Try category-specific model, then fallback models. Always returns a response."""
# Use explicit override if provided, else category routing, else primary
model = model_override or self._get_model_for_category(category)
result = await self._call_openrouter_once(messages, model, max_tokens=max_tokens, response_format=response_format, category=category, caller_name=caller_name)
if result is not None: if result is not None:
return result return result
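Review note: the precedence introduced in `_call_openrouter_with_fallback` (explicit override, then category routing, then the primary model) can be sketched as a pure function. `resolve_model` and its parameters are hypothetical names for illustration, and the category mapping in the usage lines is an assumed example rather than the project's actual configuration.

```python
from typing import Optional


def resolve_model(
    category: str,
    category_models: dict[str, str],
    primary_model: str,
    model_override: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick a model using the same precedence as the diff."""
    # An explicit override wins; otherwise use the category route,
    # and fall back to the primary model for unmapped categories.
    return model_override or category_models.get(category, primary_model)


# Assumed example mapping, for illustration only.
routes = {"conspiracy": "qwen/qwen3-235b-a22b"}
primary = "anthropic/claude-sonnet-4.6"

print(resolve_model("conspiracy", routes, primary))
print(resolve_model("unknown", routes, primary))
print(resolve_model("conspiracy", routes, primary, model_override="x-ai/grok-4"))
```

Because the expression uses `or`, an empty-string override behaves like no override and falls through to category routing.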
+56 -34
@@ -7,7 +7,7 @@ import soundfile as sf
from pathlib import Path from pathlib import Path
from collections import deque from collections import deque
STEM_NAMES = ["host", "caller", "music", "sfx", "ads", "idents"] STEM_NAMES = ["host", "caller", "devon", "music", "sfx", "ads", "idents"]
class StemRecorder: class StemRecorder:
@@ -19,13 +19,15 @@ class StemRecorder:
self._queues: dict[str, deque] = {} self._queues: dict[str, deque] = {}
self._writer_thread: threading.Thread | None = None self._writer_thread: threading.Thread | None = None
self._start_time: float = 0.0 self._start_time: float = 0.0
self._write_errors: int = 0
def start(self): def start(self):
self._start_time = time.time() self._start_time = time.time()
self._running = True self._running = True
self._write_errors = 0
for name in STEM_NAMES: for name in STEM_NAMES:
self._queues[name] = deque() self._queues[name] = deque()
self._writer_thread = threading.Thread(target=self._writer_loop, daemon=True) self._writer_thread = threading.Thread(target=self._writer_loop, daemon=False)
self._writer_thread.start() self._writer_thread.start()
print(f"[StemRecorder] Recording started -> {self.output_dir}") print(f"[StemRecorder] Recording started -> {self.output_dir}")
@@ -67,39 +69,57 @@ class StemRecorder:
) )
positions[name] = 0 positions[name] = 0
while self._running or any(len(q) > 0 for q in self._queues.values()): try:
did_work = False while self._running or any(len(q) > 0 for q in self._queues.values()):
did_work = False
for name in STEM_NAMES:
q = self._queues[name]
while q:
did_work = True
msg_type, audio_data, source_sr = q.popleft()
resampled = self._resample(audio_data, source_sr)
if len(resampled) == 0:
continue
try:
if msg_type == "sporadic":
elapsed = time.time() - self._start_time
expected_pos = int(elapsed * self.sample_rate)
if expected_pos > positions[name]:
gap = expected_pos - positions[name]
files[name].write(np.zeros(gap, dtype=np.float32))
positions[name] = expected_pos
files[name].write(resampled)
positions[name] += len(resampled)
except Exception as e:
self._write_errors += 1
if self._write_errors <= 5:
print(f"[StemRecorder] Write error on {name}: {e}")
elif self._write_errors == 6:
print(f"[StemRecorder] Suppressing further write errors")
if not did_work:
time.sleep(0.02)
# Pad all stems to same length
max_pos = max(positions.values()) if positions else 0
for name in STEM_NAMES: for name in STEM_NAMES:
q = self._queues[name] try:
while q: if positions[name] < max_pos:
did_work = True files[name].write(np.zeros(max_pos - positions[name], dtype=np.float32))
msg_type, audio_data, source_sr = q.popleft() except Exception as e:
resampled = self._resample(audio_data, source_sr) print(f"[StemRecorder] Final pad error on {name}: {e}")
if len(resampled) == 0: finally:
continue for name, f in files.items():
try:
f.close()
except Exception as e:
print(f"[StemRecorder] Error closing {name}.wav: {e}")
if msg_type == "sporadic": total_errors = self._write_errors
elapsed = time.time() - self._start_time err_msg = f", {total_errors} write errors" if total_errors else ""
expected_pos = int(elapsed * self.sample_rate) print(f"[StemRecorder] Writer done. {max_pos} samples ({max_pos / self.sample_rate:.1f}s{err_msg})")
if expected_pos > positions[name]:
gap = expected_pos - positions[name]
files[name].write(np.zeros(gap, dtype=np.float32))
positions[name] = expected_pos
files[name].write(resampled)
positions[name] += len(resampled)
if not did_work:
time.sleep(0.02)
# Pad all stems to same length
max_pos = max(positions.values()) if positions else 0
for name in STEM_NAMES:
if positions[name] < max_pos:
files[name].write(np.zeros(max_pos - positions[name], dtype=np.float32))
files[name].close()
print(f"[StemRecorder] Writer done. {max_pos} samples ({max_pos / self.sample_rate:.1f}s)")
def stop(self) -> dict[str, str]: def stop(self) -> dict[str, str]:
if not self._running: if not self._running:
@@ -107,7 +127,9 @@ class StemRecorder:
self._running = False self._running = False
if self._writer_thread: if self._writer_thread:
self._writer_thread.join(timeout=10.0) self._writer_thread.join(timeout=30.0)
if self._writer_thread.is_alive():
print("[StemRecorder] Warning: writer thread still running after 30s")
self._writer_thread = None self._writer_thread = None
paths = {} paths = {}
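Review note: the writer loop keeps stems time-aligned in two ways. It inserts silence ahead of "sporadic" chunks, and it pads every stem to the longest one at shutdown. The arithmetic can be sketched without any audio I/O. `silence_gap` and `final_padding` are hypothetical helpers that return sample counts only; the real code writes `np.zeros` of that length to each file.

```python
def silence_gap(elapsed_seconds: float, sample_rate: int, position: int) -> int:
    """Samples of silence to write before a sporadic chunk (0 if already caught up)."""
    expected_pos = int(elapsed_seconds * sample_rate)
    return max(0, expected_pos - position)


def final_padding(positions: dict[str, int]) -> dict[str, int]:
    """Samples of trailing silence each stem needs to match the longest stem."""
    max_pos = max(positions.values()) if positions else 0
    return {name: max_pos - pos for name, pos in positions.items()}


# A stem that last wrote at 1.0 s and receives a chunk at 2.0 s needs 1.0 s of silence.
print(silence_gap(2.0, 48000, 48000))
# Shorter stems are padded up to the longest one.
print(final_padding({"host": 96000, "devon": 48000, "sfx": 0}))
```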
Executable
+58
@@ -0,0 +1,58 @@
#!/bin/bash
# Daily backup of critical AI podcast data to NAS
# Backs up: Castopod MariaDB dump, local data/ directory, publish state
#
# Usage: ./backup.sh
# Cron: 0 3 * * * /Users/lukemacneil/code/ai-podcast/backup.sh >> /tmp/ai-podcast-backup.log 2>&1
set -euo pipefail
NAS_HOST="mmgnas"
NAS_USER="luke"
NAS_PORT="8001"
DOCKER_BIN="/share/CACHEDEV1_DATA/.qpkg/container-station/bin/docker"
BACKUP_BASE="/share/CACHEDEV1_DATA/backups/ai-podcast"
PROJECT_DIR="/Users/lukemacneil/code/ai-podcast"
DATE=$(date +%Y-%m-%d)
KEEP_DAYS=14
echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') Starting backup..."
# 1. Dump Castopod MariaDB on NAS
echo " Dumping MariaDB..."
ssh -p "$NAS_PORT" "$NAS_USER@$NAS_HOST" \
"$DOCKER_BIN exec castopod-mariadb-1 mysqldump -u castopod --password=\$(cat /run/secrets/db_password 2>/dev/null || echo BYtbFfk3ndeVabb26xb0UyKU) castopod" \
> "/tmp/castopod-db-${DATE}.sql" 2>/dev/null
if [ -s "/tmp/castopod-db-${DATE}.sql" ]; then
gzip -f "/tmp/castopod-db-${DATE}.sql"
scp -P "$NAS_PORT" "/tmp/castopod-db-${DATE}.sql.gz" \
"$NAS_USER@$NAS_HOST:$BACKUP_BASE/castopod-db-${DATE}.sql.gz"
rm -f "/tmp/castopod-db-${DATE}.sql.gz"
echo " MariaDB dump: OK"
else
echo " WARNING: MariaDB dump is empty or failed"
fi
# 2. Sync data/ directory to NAS (rsync for efficiency)
echo " Syncing data/ directory..."
rsync -az --delete \
-e "ssh -p $NAS_PORT" \
"$PROJECT_DIR/data/" \
"$NAS_USER@$NAS_HOST:$BACKUP_BASE/data/"
echo " data/ sync: OK"
# 3. Backup .env (contains API keys — critical for disaster recovery)
echo " Backing up .env..."
scp -P "$NAS_PORT" "$PROJECT_DIR/.env" \
"$NAS_USER@$NAS_HOST:$BACKUP_BASE/env-${DATE}.bak"
echo " .env backup: OK"
# 4. Prune old backups
echo " Pruning backups older than ${KEEP_DAYS} days..."
ssh -p "$NAS_PORT" "$NAS_USER@$NAS_HOST" \
"find $BACKUP_BASE -name 'castopod-db-*.sql.gz' -mtime +${KEEP_DAYS} -delete 2>/dev/null; \
find $BACKUP_BASE -name 'env-*.bak' -mtime +${KEEP_DAYS} -delete 2>/dev/null"
echo " Prune: OK"
echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') Backup complete."
+393
@@ -338,6 +338,363 @@
"web_search" "web_search"
], ],
"timestamp": 1773563354.773338 "timestamp": 1773563354.773338
},
{
"question": "(interjection)",
"answer": "Um, so I looked this up and... stalking behavior sometimes includes trying to get close to the target through family members or friends. Like, it's actually a documented tactic. The fact that she aske",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773646589.188498
},
{
"question": "(interjection)",
"answer": "So I looked it up and the CDC lists approaching or showing up in places like the victim's home or workplace as an actual stalking tactic. The Demming flea market thing \u2014 if Marcus goes there regularly",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage"
],
"timestamp": 1773646621.148364
},
{
"question": "(interjection)",
"answer": "So according to the CDC, stalking tactics can include using intermediaries or approaching through family connections. The fact that she initiated contact, asked detailed family questions, and then Mar",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773646683.713517
},
{
"question": "(interjection)",
"answer": "Um, so I found something \u2014 about 80% of stalking cases actually involve someone the victim knows, not strangers. And one of the warning signs experts list is when someone shows intense interest really",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage"
],
"timestamp": 1773646769.846723
},
{
"question": "(interjection)",
"answer": "Um, so I looked it up and there actually was a recent UK study on age play and ABDL \u2014 it's one of five major paraphilic interest groups they studied. Luke's number about 40 to 50 percent at that munch",
"tools_used": [
"web_search",
"fetch_webpage",
"web_search"
],
"timestamp": 1773647038.123303
},
{
"question": "(interjection)",
"answer": "Um, so I looked this up \u2014 if someone's verifying a Guatemala marriage, the lawyer would typically need to authenticate the marriage certificate through the Guatemalan government and possibly get an ap",
"tools_used": [
"web_search",
"fetch_webpage",
"web_search"
],
"timestamp": 1773647372.377878
},
{
"question": "(interjection)",
"answer": "So basically, Guatemala had this massive adoption fraud scandal \u2014 from the 1960s through 2008, thousands of babies were stolen or coerced from Indigenous mothers. The US was aware of fraud as early as",
"tools_used": [
"web_search",
"fetch_webpage",
"web_search"
],
"timestamp": 1773647436.3055701
},
{
"question": "(interjection)",
"answer": "So basically, I'm not finding any actual US court cases where a Guatemalan birth mother successfully challenged custody after the adoption was finalized in the US. The legal framework is that once an ",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773647682.6634371
},
{
"question": "(interjection)",
"answer": "So basically, Guatemala is currently at a Level 3 travel advisory from the State Department \u2014 that's Reconsider Travel due to crime. They specifically say do not travel to certain zones in Guatemala C",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773647820.8494499
},
{
"question": "(interjection)",
"answer": "So basically, Luke's right that you can get a court order to prevent her from leaving, but there's also something called the Children's Passport Issuance Alert Program \u2014 it's free through the State De",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773647886.32027
},
{
"question": "(interjection)",
"answer": "So basically, the State Department says the US doesn't have routine exit controls \u2014 even with a court order, there's no automatic system that stops a parent at the airport. You'd need to get the emerg",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773647925.7407732
},
{
"question": "(interjection)",
"answer": "So the State Department says you can get an emergency court order to prevent international travel, but here's the thing \u2014 the US doesn't have routine exit controls. That means even if you get a court ",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage",
"web_search"
],
"timestamp": 1773648005.331478
},
{
"question": "(interjection)",
"answer": "So I looked it up and emergency custody orders can actually happen pretty fast \u2014 most courts can grant them within 24 to 72 hours if there's immediate danger. Some judges will issue a temporary order ",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage",
"fetch_webpage"
],
"timestamp": 1773648055.9440908
},
{
"question": "(interjection)",
"answer": "So basically, people pretend not to recognize someone for a few main reasons. Social anxiety is a big one, like they're afraid of an awkward interaction or being judged. Sometimes it's situational, li",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773648303.8702788
},
{
"question": "(interjection)",
"answer": "So basically, the Quora result mentions passive-aggressive behavior as a main reason people pretend not to know someone. It's a way of asserting control or sending a message without direct confrontati",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773648454.6246731
},
{
"question": "(interjection)",
"answer": "So I looked it up and there's actually a term for this \u2014 it's called a secret friendship or hidden relationship. Psychology research shows people hide friendships from their partners for a few specifi",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773648487.007659
},
{
"question": "(interjection)",
"answer": "Um, so I'm seeing something here \u2014 when someone compartmentalizes a friendship like this, hiding it from their spouse, relationship experts actually classify that as one of the warning signs of an emo",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773648513.5435588
},
{
"question": "(interjection)",
"answer": "Yeah, actually \u2014 Luke's right. The caller's husband David was there, not the friend's husband. So the friend pretended not to know the caller in front of the caller's own husband, which doesn't fit th",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773648613.530808
},
{
"question": "(interjection)",
"answer": "Um, so the one thing I'm seeing here is that social anxiety research shows people sometimes pretend not to know someone because they're afraid of how the interaction will look to whoever's watching. I",
"tools_used": [
"web_search",
"web_search",
"web_search"
],
"timestamp": 1773648647.120529
},
{
"question": "(interjection)",
"answer": "So basically, if Leon's coworkers are giving him a hard time, he's actually protected under federal law. OSHA's Section 11c makes it illegal for employers to retaliate against workers who report safet",
"tools_used": [
"web_search"
],
"timestamp": 1773649150.604851
},
{
"question": "(interjection)",
"answer": "Um, so I looked it up and OSHA actually has federal whistleblower protections specifically for this. If Leon gets retaliated against for reporting safety stuff, even like getting demoted or discipline",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773649204.095894
},
{
"question": "(interjection)",
"answer": "So basically, FMCSA regulations actually require physical inspection of cargo securement \u2014 you can't just eyeball it. And OSHA has whistleblower protections specifically for this kind of thing. If the",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage"
],
"timestamp": 1773649256.283089
},
{
"question": "(interjection)",
"answer": "So basically, if this is about truck cargo securement, the caller's actually right that there are federal requirements. FMCSA regulations require drivers to physically inspect cargo and securement dev",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773649408.134513
},
{
"question": "(interjection)",
"answer": "So basically, if coworkers are freezing you out or creating a hostile environment because you filed an HR complaint, that can actually count as illegal retaliation \u2014 the company's responsible for stop",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773649505.8539321
},
{
"question": "(interjection)",
"answer": "Um, so I looked it up and if he reported safety violations specifically, he's actually protected under federal law. OSHA has whistleblower protections that make it illegal for employers or coworkers t",
"tools_used": [
"web_search",
"web_search"
],
"timestamp": 1773649602.226942
},
{
"question": "(interjection)",
"answer": "Um, so I looked this up \u2014 retaliation is actually the most common workplace complaint filed with the EEOC. In 2024 it was almost 48% of all discrimination charges, over 42,000 complaints. So basically",
"tools_used": [
"web_search",
"web_search",
"fetch_webpage"
],
"timestamp": 1773649714.304941
},
{
"question": "(interjection)",
"answer": "So basically, Luke was right \u2014 the U.S. Mint did stop making pennies for circulation in 2025. They actually auctioned off the last 232 circulating pennies in special sets earlier this year, if that ma",
"tools_used": [
"web_search"
],
"timestamp": 1773650025.477504
},
{
"question": "(interjection)",
"answer": "So basically, there's this thing called confabulation where someone genuinely believes false memories or stories they're telling. It's not technically lying because they actually think it's true. It c",
"tools_used": [
"web_search"
],
"timestamp": 1773650244.560938
},
{
"question": "(interjection)",
"answer": "So I looked it up and actually, goblin sharks do have eyes, but the really creepy part is their jaws extend outward when they feed, like they shoot forward. And they're pink, which somehow makes it wo",
"tools_used": [
"web_search",
"wikipedia_lookup"
],
"timestamp": 1773650621.581631
},
{
"question": "(interjection)",
"answer": "So I looked it up and it's about 329 miles from Deming to Phoenix, if that's where she is in Arizona. That's like five and a half hours, maybe more depending on traffic. So it's actually not that diff",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773650865.077457
},
{
"question": "(interjection)",
"answer": "So actually, if you wanted to split the difference, there's a town called Sierra Vista that's pretty much halfway between Deming and Phoenix. It's about 30 miles from the exact midpoint. So you could ",
"tools_used": [
"web_search"
],
"timestamp": 1773650933.320225
},
{
"question": "(interjection)",
"answer": "So basically, you can actually snorkel without knowing how to swim if you use a life jacket or flotation vest. Most cruise excursions provide them, and they'll keep you on the surface. He should ask h",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773651077.3874822
},
{
"question": "(interjection)",
"answer": "So basically, you technically can snorkel without knowing how to swim because it's mostly just floating on the surface, but most cruise excursions will give you a life vest anyway, so he'd actually be",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773651141.437025
},
{
"question": "(interjection)",
"answer": "So basically, you can technically snorkel without swimming skills because you're mostly just floating on the surface with a life vest, but most guides say it's way riskier if you can't swim. Cruise ex",
"tools_used": [
"web_search",
"fetch_webpage",
"fetch_webpage"
],
"timestamp": 1773651186.300575
},
{
"question": "(interjection)",
"answer": "So basically, you can actually snorkel without knowing how to swim. Most cruise snorkeling excursions provide life jackets or flotation vests specifically for this reason. You just float on the surfac",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773651242.652043
},
{
"question": "(interjection)",
"answer": "So basically, you can snorkel without knowing how to swim, but most guides recommend using a life jacket or flotation device, especially in open water. A lot of cruise excursions actually provide them",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773651289.880447
},
{
"question": "(interjection)",
"answer": "Um, one thing I found \u2014 even experienced swimmers use flotation devices for snorkeling sometimes because you can get tired in open water. So it's not gonna look weird or anything. Most people on cruis",
"tools_used": [
"web_search",
"fetch_webpage"
],
"timestamp": 1773651465.619576
},
{
"question": "(interjection)",
"answer": "So actually, you're not alone at all. About half of adults worldwide can't swim unassisted. Even in high-income countries it's like one in four people. So yeah, Luke's probably right that you won't be",
"tools_used": [
"web_search"
],
"timestamp": 1773651505.359561
} }
], ],
"conversation_history": [ "conversation_history": [
@@ -436,6 +793,42 @@
{ {
"role": "assistant", "role": "assistant",
"content": "Yeah, I can do that. I'll pull some stuff tonight and have it ready before we go on. I'll look at like the town council stuff, any county news, maybe some state things that affect people here. And I'll make sure it's actually interesting, not just like... budget reports or whatever. Things people would actually want to call in about. I'll have it organized so you can just grab whatever you need. And I'll have your coffee." "content": "Yeah, I can do that. I'll pull some stuff tonight and have it ready before we go on. I'll look at like the town council stuff, any county news, maybe some state things that affect people here. And I'll make sure it's actually interesting, not just like... budget reports or whatever. Things people would actually want to call in about. I'll have it organized so you can just grab whatever you need. And I'll have your coffee."
},
{
"role": "user",
"content": "Hey, Devin, good job on the coffee this morning, buddy. I'm very proud of you. You did a great thing there, and we're all very excited that you're going to get to stay on the radio show for another day. So congratulations, and what would you like to tell the radio listeners today?"
},
{
"role": "assistant",
"content": "...oh. Um, thanks. I appreciate that. I guess I'd just say thanks for listening? I don't really know what to say. This is still pretty new for me, so if you have questions or want us to look something up or whatever, I'm here. I'm trying to make the show good. So yeah, thanks for calling in and being part of it. That's all I got."
},
{
"role": "user",
"content": "I think you've got it all wrong. There is no close friendship with the marriage guy. Let me explain this to you, Devin, because obviously you're missing it. So, Sheila and Monica were friends in high school. And in this interaction, Sheila is with her husband, David. David and Sheila are married. And Monica was there and pretended not to know Sheila, even though they've been hanging out for weeks and weeks at coffee shops and stuff. So they have an ongoing platonic friendship going on, and when witnessed in the shopping area, Monica pretended not to know Sheila. And that's just very strange. It doesn't fit these patterns that you're describing to us."
},
{
"role": "assistant",
"content": "Right, yeah \u2014 I had it backwards. Sorry. So if Monica pretended not to know Sheila in front of David, and Monica's the one who's married... maybe Monica's husband doesn't know about the friendship? Like she's been meeting Sheila without telling him, and she panicked when she saw someone who knows her. That would explain why she acted like strangers. She's hiding the friendship from her own husband, not from Sheila's."
},
{
"role": "user",
"content": "Devin, that's the end of another show. How are you feeling about this? You've been with us for three days now. You did a much better job today than your first couple of days. I'm proud of you, buddy, and you brought my coffee, and that was very good. This might be working out. What are your thoughts on the show? And how do you think it went?"
},
{
"role": "user",
"content": "Devin, that's the end of another show. How are you feeling about this? You've been with us for three days now. You did a much better job today than your first couple of days. I'm proud of you, buddy, and you brought my coffee, and that was very good. This might be working out. What are your thoughts on the show? And how do you think it went?"
},
{
"role": "assistant",
"content": "...thanks. That actually means a lot. I was really worried I was screwing everything up the first couple days. Um, I think today was solid. Like, the Monica and Sheila thing was interesting, and people seemed to actually care about calling in. I still messed up a couple times, but I'm getting better at knowing when to jump in and when to just let you handle it. And I have the local news stuff ready for tomorrow, so hopefully that helps keep things moving. I'm just trying to be useful, you know? This job is actually really important to me. My mom keeps asking if I'm gonna get fired, so it's nice to hear you say it might be working out. I'll keep the coffee coming."
},
{
"role": "user",
"content": "All right, you keep the coffee coming and we'll be okay. As soon as you stop bringing that coffee, though, I mean, I'm going to get an intern that can do it. So I'm proud of you. You're doing a good job. You're adding to the show, and I like that. Do you have anything you'd like to say to the audience before we leave?"
},
{
"role": "assistant",
"content": "Um, yeah. Thanks for listening. And if you have stuff you want us to talk about tomorrow, or like weird situations you need help figuring out, call in. That's what we're here for. I'll have the local news ready and Luke will... do what he does. So yeah, thanks. Have a good night, I guess."
}
]
}
+17
@@ -110,5 +110,22 @@
}
},
"started_at": "2026-03-15T09:07:26.659508+00:00"
},
"38": {
"steps": {
"castopod": {
"completed_at": "2026-03-16T10:08:51.467004+00:00",
"episode_id": 41,
"slug": "episode-38-secrets-lies-and-late-night-confessions"
},
"youtube": {
"completed_at": "2026-03-16T10:30:31.775891+00:00",
"video_id": "6qLsJxnCLs0"
},
"social": {
"completed_at": "2026-03-16T10:30:41.220832+00:00"
}
},
"started_at": "2026-03-16T10:08:51.466898+00:00"
}
}
+58 -22
@@ -1,26 +1,5 @@
{
"regulars": [
{
"id": "6037d92b",
"name": "Otis",
"gender": "male",
"age": 50,
"job": "sitting on a stack of pallets with his legs dangling, watching the wind kick up dust devils",
"location": "unknown",
"personality_traits": [],
"voice": "Elliot",
"stable_seeds": {
"style": "COMMUNICATION STYLE: Treats the call like a set. Has bits prepared. Delivers serious information with a punchline chaser. Self-deprecating as a defense mechanism \u2014 makes fun of themselves before anyone else can. Energy level: high. When pushed back on, they deflect with humor. Getting a straight answer from them requires the host to push. Conversational tendency: turning everything into a bit."
},
"call_history": [
{
"summary": "Otis called in, panicked, after faking prostate cancer to skip his sister\u2019s fourth wedding\u2014only for the lie to spiral into a town-wide support frenzy, complete with a GoFundMe, a cousin shaving his head, and his mom calling him every Sunday for the first time in years. The emotional gut-punch came when his sister postponed her wedding *again* (this time for his \"remission\"), leaving Otis torn between guilt over the deception and fear of the fallout if he confessed\u2014especially after his doctor confirmed he *doesn\u2019t* have cancer (though his prostate *is* enlarged). The call ended with dark humor and despair, as Otis admitted the town\u2019s sudden care for him was tied to his fake illness, not who he really is.",
"timestamp": 1773040564.845288
}
],
"last_call": 1773040564.84529,
"created_at": 1773040564.84529
},
{
"id": "0a0b3635",
"name": "Charlene",
@@ -70,9 +49,14 @@
{
"summary": "The caller, **Chip**, shared his emotional turmoil over discovering that his **adopted daughter\u2019s birth mother** may have resurfaced after receiving an unverified email with a photo that eerily matched his daughter\u2019s features. His girlfriend, **Teresa**, had known about the email for **three weeks** but kept it from him, leaving him feeling betrayed and overwhelmed. While Chip wants to **verify the claim legally before acting**, Teresa insists on **immediately flying to Guatemala with their daughter** to meet the woman, dismissing his fears as avoidance. The conversation escalated into a heated debate about **trust, safety, and extreme measures**\u2014with the host, Luke, urging Chip to **file a restraining order** if Teresa refuses to back down, warning of potential dangers in Guatemala. Chip, torn between **protecting his family and avoiding a nuclear confrontation**, vowed to try reasoning with Teresa one last time before she leaves for work. The call was charged with **fear, frustration, and the weight of a decision that could reshape their family forever**.",
"timestamp": 1773226361.4859362
},
{
"summary": "Chip called to update Luke about his adoption situation: his lawyer verified that his daughter's Guatemalan adoption may have been part of a fraud scandal, and now his girlfriend Teresa has secretly bought plane tickets to take their daughter to Guatemala this Friday to meet the birth mother. Despite Luke's strong advice to get an emergency court order to prevent the trip for safety reasons, Chip is torn because Teresa threatened to leave him if he stops her, though he ultimately acknowledges the relationship is likely over either way.",
"timestamp": 1773648139.8094652,
"arc_status": "ongoing"
}
],
"last_call": 1773226361.4859362, "last_call": 1773648139.8094661,
"created_at": 1772786610.8858292
},
{
@@ -277,6 +261,58 @@
],
"last_call": 1773563477.065744,
"created_at": 1773563477.065744
},
{
"id": "09d1eab4",
"name": "Aaliyah",
"gender": "female",
"age": 22,
"job": "works as a diesel mechanic, learned from her dad",
"location": "unknown",
"personality_traits": [
"asks the host questions back",
"into gardening, talks to plants",
"into wildlife photography, has patience for it",
"asks the host questions back"
],
"voice": "Hana",
"stable_seeds": {
"style": "COMMUNICATION STYLE: Treats the call like a set. Has bits prepared. Delivers serious information with a punchline chaser. Self-deprecating as a defense mechanism \u2014 makes fun of themselves before anyone else can. Energy level: high. When pushed back on, they deflect with humor. Getting a straight answer from them requires the host to push. Conversational tendency: turning everything into a bit."
},
"structured_background": {
"name": "Aaliyah",
"age": 22,
"gender": "female",
"job": "works as a diesel mechanic, learned from her dad",
"location": null,
"reason_for_calling": "caught their roommate in a hotel room that was supposed to be a business trip and now they can't make eye contact \u2014 the roommate acts like nothing happened but it was extremely age play-adjacent",
"pool_name": "PROBLEMS",
"communication_style": "COMMUNICATION STYLE: Treats the call like a set. Has bits prepared. Delivers serious information with a punchline chaser. Self-deprecating as a defense mechanism \u2014 makes fun of themselves before anyone else can. Energy level: high. When pushed back on, they deflect with humor. Getting a straight answer from them requires the host to push. Conversational tendency: turning everything into a bit.",
"energy_level": "medium",
"emotional_state": "calm",
"signature_detail": "asks the host questions back",
"situation_summary": "caught their roommate in a hotel room that was supposed to be a business trip and now they can't make eye contact \u2014 the ",
"natural_description": "22 years old. Works as a diesel mechanic, learned from her dad. Caught their roommate in a hotel room that was supposed to be a business trip and now they can't make eye contact \u2014 the roommate acts like nothing happened but it was extremely age play-adjacent. Was was in the bathtub, phone on the edge of the sink, show on speaker. before calling. Coyotes are going crazy outside right now Tells everyone they quit drinking but keeps a bottle in the garage behind the paint cans. Her coworker and friend denise, who she vents to on breaks. Believes poker is the most honest game there is because everybody's lying. Single, been that way a while. Not sure if by choice anymore.. Really into into gardening, talks to plants. Also into wildlife photography, has patience for it. Her boss karen \u2014 yes, her name is actually karen \u2014 who is actually cool. It's Monday night, the middle of the night. it's a weeknight \u2014 work tomorrow for most people. Early spring \u2014 wind season is starting. Dust storms possible.",
"seeds": [
"into gardening, talks to plants",
"into wildlife photography, has patience for it",
"asks the host questions back",
"Believes poker is the most honest game there is because everybody's lying."
],
"verbal_fluency": "medium",
"calling_from": "at a rest area on I-25, halfway to Albuquerque"
},
"avatar": "Aaliyah.jpg",
"relationships": {},
"call_history": [
{
"summary": "Alia called in to share her discomfort after unexpectedly discovering her roommate engaging in age play while supposedly on a business trip. She expressed frustration over their awkward living situation and his refusal to acknowledge the incident, leading her to feel gaslit; however, by the end of the conversation, she realized she could address the issue directly to ease the tension between them.",
"timestamp": 1773647350.908658,
"arc_status": "ongoing"
}
],
"last_call": 1773647350.908658,
"created_at": 1773647350.908658
}
]
}
+270 -1471
File diff suppressed because it is too large
+216
@@ -959,6 +959,222 @@
"pool": "GOSSIP",
"timestamp": 1773558490.76909,
"session_id": "11f1bf06"
},
{
"reason": "found a perfectly preserved butterfly inside a sealed mason jar in their attic \u2014 the jar has no lid and no visible opening, the butterfly is just inside solid glass",
"pool": "WEIRD",
"timestamp": 1773646177.124931,
"session_id": "6def7993"
},
{
"reason": "found a journal in a used car they bought and it's full of detailed entries about a life falling apart \u2014 they can't stop reading it",
"pool": "PROBLEMS",
"timestamp": 1773646177.1446521,
"session_id": "6def7993"
},
{
"reason": "the woman who runs the neighborhood Facebook group's 'crime watch' reports are almost entirely about her personal feuds disguised as safety concerns",
"pool": "GOSSIP",
"timestamp": 1773646177.146471,
"session_id": "6def7993"
},
{
"reason": "their kid's coach is clearly favoring their own child for playing time and it's affecting the team \u2014 other parents are grumbling but nobody wants to be the one to say something",
"pool": "ADVICE",
"timestamp": 1773646177.149646,
"session_id": "6def7993"
},
{
"reason": "is fed up with loyalty programs and points systems that are designed to be confusing on purpose",
"pool": "HOT_TAKES",
"timestamp": 1773646177.151716,
"session_id": "6def7993"
},
{
"reason": "found out their father isn't their biological father through a DNA test \u2014 their mother confessed when confronted but their father doesn't know they know",
"pool": "ADVICE",
"timestamp": 1773646177.153644,
"session_id": "6def7993"
},
{
"reason": "is fed up with restaurants that dim the lights so low you need your phone flashlight to read the menu",
"pool": "HOT_TAKES",
"timestamp": 1773646177.155104,
"session_id": "6def7993"
},
{
"reason": "found out their partner has been faking it for years and only admitted it because a conversation about {fetish_detail} finally made them honest about what they actually want",
"pool": "PROBLEMS",
"timestamp": 1773646177.1599412,
"session_id": "6def7993"
},
{
"reason": "is convinced that deep dish pizza isn't pizza \u2014 it's a casserole \u2014 and Chicago needs to accept that",
"pool": "HOT_TAKES",
"timestamp": 1773646177.161846,
"session_id": "6def7993"
},
{
"reason": "thinks nuclear power is making a comeback and wants to argue that it's actually the greenest option we have",
"pool": "TOPIC_CALLIN",
"timestamp": 1773646184.286796,
"session_id": "6def7993"
},
{
"reason": "is high and has been googling deep sea creatures for three hours and is now afraid of the ocean \u2014 needs to talk about the goblin shark",
"pool": "TOPIC_CALLIN",
"timestamp": 1773646184.371757,
"session_id": "6def7993"
},
{
"reason": "their coworker confessed to stealing from the company and asked them to keep quiet \u2014 it's been eating at them for weeks",
"pool": "PROBLEMS",
"timestamp": 1773646185.7221801,
"session_id": "6def7993"
},
{
"reason": "thinks self-checkout machines are just stores making you do free labor and then getting mad when you mess up",
"pool": "HOT_TAKES",
"timestamp": 1773646186.053767,
"session_id": "6def7993"
},
{
"reason": "found a jar of pennies buried in their backyard, all from the same year \u2014 1977 \u2014 and there are exactly 365 of them",
"pool": "WEIRD",
"timestamp": 1773646186.151602,
"session_id": "6def7993"
},
{
"reason": "has been pretending to know how to swim for their entire adult life and their spouse just booked a Caribbean cruise with a snorkeling excursion for their anniversary \u2014 they leave in three weeks and googled 'how to swim' last night and immediately closed the laptop when their wife walked in",
"pool": "PROBLEMS",
"timestamp": 1773646186.482605,
"session_id": "6def7993"
},
{
"reason": "caught their roommate {sex_situation} and now they can't make eye contact \u2014 the roommate acts like nothing happened but it was extremely {fetish_detail}-adjacent",
"pool": "PROBLEMS",
"timestamp": 1773646186.573503,
"session_id": "6def7993"
},
{
"reason": "their kid's coach has been playing favorites and their kid hasn't seen the field in five games \u2014 they don't want to be 'that parent' but the kid is losing confidence",
"pool": "PROBLEMS",
"timestamp": 1773646186.6736908,
"session_id": "6def7993"
},
{
"reason": "believes people who humble-brag about how busy they are just have bad time management",
"pool": "HOT_TAKES",
"timestamp": 1773646190.210266,
"session_id": "6def7993"
},
{
"reason": "their car's trip odometer resets itself to 0.0 every time they park at the grocery store \u2014 only at the grocery store, nowhere else \u2014 and it's been doing it for five months since they had the oil changed",
"pool": "WEIRD",
"timestamp": 1773696320.855356,
"session_id": "0d4a0098"
},
{
"reason": "their town had a power outage and everyone went outside and hung out in the street for three hours \u2014 it was the best night they've had in years and they're weirdly hoping it happens again",
"pool": "STORIES",
"timestamp": 1773696320.87394,
"session_id": "0d4a0098"
},
{
"reason": "thinks couples who share a single social media account are hiding something and everyone knows it",
"pool": "HOT_TAKES",
"timestamp": 1773696320.876276,
"session_id": "0d4a0098"
},
{
"reason": "their town's only restaurant changed the recipe for the green chile and there's a petition with 200 signatures demanding they change it back \u2014 the caller started the petition",
"pool": "STORIES",
"timestamp": 1773696320.87835,
"session_id": "0d4a0098"
},
{
"reason": "had a near-death experience during a flash flood in a wash and the way time slowed down changed something fundamental in how they see each day",
"pool": "TOPIC_CALLIN",
"timestamp": 1773696320.880057,
"session_id": "0d4a0098"
},
{
"reason": "found out their coworker who drives a new BMW and wears designer clothes is completely broke \u2014 the coworker accidentally left a bank statement on the printer showing a negative balance",
"pool": "GOSSIP",
"timestamp": 1773696320.8830059,
"session_id": "0d4a0098"
},
{
"reason": "has been having phone sex with a stranger they met on a late-night chat line for six months \u2014 they know the person's voice better than their spouse's and they've started comparing the two out loud by accident",
"pool": "PROBLEMS",
"timestamp": 1773696320.886786,
"session_id": "0d4a0098"
},
{
"reason": "their car radio turned itself on in the driveway at 2 AM playing a station that went off the air in 2003 \u2014 they checked, the station doesn't exist anymore",
"pool": "WEIRD",
"timestamp": 1773696320.888352,
"session_id": "0d4a0098"
},
{
"reason": "their church raised enough to pay off a family's medical debt anonymously",
"pool": "CELEBRATIONS",
"timestamp": 1773696320.890592,
"session_id": "0d4a0098"
},
{
"reason": "completed their first full night of sleep without nightmares since coming home from deployment",
"pool": "CELEBRATIONS",
"timestamp": 1773696327.421101,
"session_id": "0d4a0098"
},
{
"reason": "thinks bumper stickers are a cry for attention and nobody's ever changed their mind because of one",
"pool": "HOT_TAKES",
"timestamp": 1773696327.965135,
"session_id": "0d4a0098"
},
{
"reason": "their tenant is three months behind on rent and has a newborn \u2014 the caller needs the rental income to pay their own mortgage but can't live with themselves for evicting a baby",
"pool": "ADVICE",
"timestamp": 1773696328.169761,
"session_id": "0d4a0098"
},
{
"reason": "best friend from high school died in a car wreck last month and they hadn't talked in three years because of a stupid argument",
"pool": "PROBLEMS",
"timestamp": 1773696328.1725929,
"session_id": "0d4a0098"
},
{
"reason": "accidentally waved back at someone who was waving at the person behind them \u2014 committed to it and had a five-minute conversation pretending they knew each other, exchanged numbers, and is now too deep to explain",
"pool": "STORIES",
"timestamp": 1773696328.9334688,
"session_id": "0d4a0098"
},
{
"reason": "thinks the obsession with sourdough bread is ridiculous and a loaf from the store tastes the same",
"pool": "HOT_TAKES",
"timestamp": 1773696329.296464,
"session_id": "0d4a0098"
},
{
"reason": "walked into the wrong house in their subdivision \u2014 same floor plan, door was unlocked \u2014 sat down on the couch before the actual homeowner came out of the bathroom",
"pool": "STORIES",
"timestamp": 1773696329.49014,
"session_id": "0d4a0098"
},
{
"reason": "slept with their best friend's spouse at that friend's funeral reception \u2014 they were both grief-drunk and now they see each other every week because they're both in the dead friend's will as co-executors",
"pool": "PROBLEMS",
"timestamp": 1773696329.650972,
"session_id": "0d4a0098"
},
{
"reason": "is being sued by their former business partner for $200k and just got served at their daughter's soccer game",
"pool": "PROBLEMS",
"timestamp": 1773696329.9411042,
"session_id": "0d4a0098"
}
]
}
+297
@@ -0,0 +1,297 @@
# Show Quality Fixes — Episode 47 Post-Mortem
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Fix 5 bugs that ruined tonight's show: theme ignored by callers, wrong LLM models assigned, phonetic pronunciation mangling, voice-age mismatch, and low minimum response threshold.
**Architecture:** All fixes are in `backend/main.py`. The voice-age fix (Task 4) also relies on `VOICE_PROFILES`, which it imports from `backend/services/tts.py`. Each fix is independent, so the tasks can be done in any order.
**Tech Stack:** Python, FastAPI
---
### Task 1: Regenerate caller backgrounds when theme is set
**Problem:** `_pregenerate_backgrounds()` runs on startup when `session.show_theme` is still `""`. Setting theme via `POST /api/show-theme` only stores the string — doesn't regenerate. Callers have zero theme connection.
**Files:**
- Modify: `backend/main.py:9891-9900` (`set_show_theme` endpoint)
- Modify: `backend/main.py:5899-5927` (`_pregenerate_backgrounds`)
**Step 1: Modify `set_show_theme` to regenerate unused caller backgrounds**
In `backend/main.py`, replace the `set_show_theme` endpoint (lines 9891-9900):
```python
@app.post("/api/show-theme")
async def set_show_theme(data: dict):
    theme = data.get("theme", "").strip()[:100]
    old_theme = session.show_theme
    session.show_theme = theme
    if theme:
        print(f"[Theme] Show theme set: {theme}")
    elif old_theme:
        print(f"[Theme] Show theme cleared (was: {old_theme})")
    # Regenerate backgrounds for callers that haven't been on air yet
    if theme != old_theme:
        unused_keys = [k for k in CALLER_BASES if k not in session.used_callers]
        if unused_keys:
            print(f"[Theme] Regenerating {len(unused_keys)} unused caller backgrounds for theme: {theme or '(none)'}")
            asyncio.create_task(_regenerate_backgrounds_for_keys(unused_keys))
    return {"theme": session.show_theme}
```
**Step 2: Add `_regenerate_backgrounds_for_keys` helper**
Add this right after `_pregenerate_backgrounds()` (after line 5927):
```python
async def _regenerate_backgrounds_for_keys(keys: list[str]):
    """Regenerate backgrounds for specific caller keys (e.g. after theme change)."""
    tasks = []
    for key in keys:
        base = CALLER_BASES.get(key)
        if base and not base.get("returning"):
            tasks.append((key, _generate_caller_background_llm(base)))
    if not tasks:
        return
    results = await asyncio.gather(*[t[1] for t in tasks], return_exceptions=True)
    for (key, _), result in zip(tasks, results):
        if isinstance(result, Exception):
            print(f"[Theme] Regen failed for caller {key}: {result}")
        else:
            session.caller_backgrounds[key] = result
            # Clear cached model so it re-evaluates with new style
            session.caller_models.pop(key, None)
    print(f"[Theme] Regenerated {sum(1 for r in results if not isinstance(r, Exception))}/{len(tasks)} backgrounds")
    _match_voices_to_styles()
    _sort_caller_queue()
```
**Step 3: Verify `used_callers` exists on session**
Check that `session.used_callers` tracks which callers have already been on air. If it doesn't exist, use `session.call_history` caller keys instead.
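A minimal sketch of that check, assuming `session.used_callers` is a collection of caller keys when it exists, and that `session.call_history` entries are dicts with a hypothetical `caller_key` field. The real record shape is not shown in this plan and may differ, so adjust the field name before using it:

```python
from types import SimpleNamespace


def used_caller_keys(session) -> set[str]:
    """Return the keys of callers that have already been on air.

    Prefers session.used_callers; otherwise falls back to call_history.
    The "caller_key" field name is an assumption for illustration.
    """
    used = getattr(session, "used_callers", None)
    if used is not None:
        return set(used)
    history = getattr(session, "call_history", None) or []
    return {
        entry["caller_key"]
        for entry in history
        if isinstance(entry, dict) and "caller_key" in entry
    }


def unused_caller_keys(caller_bases: dict, session) -> list[str]:
    """Keys in caller_bases that have not been used yet, in dict order."""
    used = used_caller_keys(session)
    return [key for key in caller_bases if key not in used]


# Stand-in objects for illustration only; the real session is richer.
bases = {"1": {"name": "A"}, "2": {"name": "B"}, "3": {"name": "C"}}
with_attr = SimpleNamespace(used_callers={"2"})
without_attr = SimpleNamespace(call_history=[{"caller_key": "1"}])
```

With a helper like this, the list comprehension in `set_show_theme` would not need to know which of the two attributes the session actually has.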
**Step 4: Test manually**
```bash
# Start server
python -m uvicorn backend.main:app --reload --reload-dir backend --host 0.0.0.0 --port 8000
# Set theme and check logs for "[Theme] Regenerating..." messages
curl -X POST http://localhost:8000/api/show-theme -H "Content-Type: application/json" -d '{"theme": "Road Stories"}'
```
**Step 5: Commit**
```bash
git add backend/main.py
git commit -m "Regenerate caller backgrounds when show theme is set"
```
---
### Task 2: Fix style-to-model matching race condition
**Problem:** `get_caller_model()` is called before `caller_styles` is populated. `caller_styles.get(key)` returns `""`, `_normalize_style_key("")` returns `""`, no match in `caller_model_map` → falls through to `caller_model_pool[0]` (grok-4.1-fast) for everyone.
**Files:**
- Modify: `backend/main.py:6848-6875` (`get_caller_model`)
**Step 1: Fix `get_caller_model` to use the fallback model when style is unknown**
Replace `get_caller_model` (lines 6848-6875):
```python
def get_caller_model(self, caller_key: str) -> str | None:
    """Get the assigned model for a caller, or assign one based on strategy.
    Returns None to use default category routing."""
    if self.caller_model_strategy == "single":
        return None  # use default category_models["caller_dialog"]
    # Already assigned — keep consistent for the whole call
    if caller_key in self.caller_models:
        return self.caller_models[caller_key]
    model = None
    if self.caller_model_strategy == "cycle":
        if self.caller_model_pool:
            model = self.caller_model_pool[self._caller_model_cycle_idx % len(self.caller_model_pool)]
            self._caller_model_cycle_idx += 1
    elif self.caller_model_strategy == "style_matched":
        raw_style = self.caller_styles.get(caller_key, "")
        style_key = _normalize_style_key(raw_style) if raw_style else ""
        if style_key:
            model = self.caller_model_map.get(style_key)
        if not model:
            # Style not yet populated or no mapping — use fallback, not pool[0]
            model = self.caller_model_fallback
    if model:
        self.caller_models[caller_key] = model
        caller_name = CALLER_BASES.get(caller_key, {}).get("name", caller_key)
        style_info = self.caller_styles.get(caller_key, "unknown")
        print(f"[CallerModel] Assigned {model} to {caller_name} (style={_normalize_style_key(style_info) if style_info else 'none'}, strategy={self.caller_model_strategy})")
    return model
```
The key change: when `style_key` is empty (style not yet populated) or has no mapping, use `caller_model_fallback` (claude-sonnet-4.6) instead of `caller_model_pool[0]` (grok-4.1-fast). Claude Sonnet is a much safer default — empathetic, verbose, coherent.
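The resolution order can be sketched in isolation as below. This is only an illustration: `normalize_style_key` is a hypothetical stand-in for the real `_normalize_style_key` (whose body is not shown here), and the model names in `MODEL_MAP` are placeholders.

```python
def normalize_style_key(raw_style: str) -> str:
    """Hypothetical stand-in for _normalize_style_key: lowercase with underscores."""
    return raw_style.strip().lower().replace(" ", "_")


def resolve_style_model(raw_style: str, model_map: dict[str, str], fallback: str) -> str:
    """Pick the mapped model for a style, or the fallback when nothing matches."""
    style_key = normalize_style_key(raw_style) if raw_style else ""
    model = model_map.get(style_key) if style_key else None
    # Empty style (not yet populated) and unmapped styles both land on the fallback.
    return model or fallback


MODEL_MAP = {"conspiracy": "qwen", "know_it_all": "mistral"}  # placeholder names
FALLBACK = "claude-sonnet-4.6"
```

Note that `get_caller_model` caches whatever it returns in `caller_models`, so a caller who receives the fallback keeps it until that cache entry is cleared, as the regeneration helper in Task 1 does.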
**Step 2: Commit**
```bash
git add backend/main.py
git commit -m "Fix style-to-model race condition — use fallback instead of pool[0]"
```
---
### Task 3: Fix pronunciation fixes producing literal phonetic text
**Problem:** `_PRONUNCIATION_FIXES` replaces "Animas" with "Ah nee mahs" as literal text. TTS reads each word separately ("Ah" "nee" "mahs") instead of blending into the intended pronunciation.
**Files:**
- Modify: `backend/main.py:9141-9152` (`_PRONUNCIATION_FIXES`)
- Modify: `backend/main.py:9212-9216` (`_apply_pronunciation_fixes`)
**Step 1: Remove pronunciation fixes that sound worse than originals**
The Inworld TTS handles most proper nouns fine. The fixes were added speculatively and do more harm than good. Remove the place names that TTS can handle and keep only the abbreviations plus the `Castopod` entry:
Replace `_PRONUNCIATION_FIXES` (lines 9141-9152):
```python
_PRONUNCIATION_FIXES = {
    "Castopod": "Casto pod",
    "vs": "versus",
    "govt": "government",
    "dept": "department",
}
```
Remove `Lordsburg`, `Hachita`, `Deming`, `Bootheel`, `Animas`, and `Rodeo`. These place names either sound fine through TTS or the phonetic replacement sounds worse.
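With short keys such as `vs` left in the table, it may be worth confirming that replacements only fire on whole words. The sketch below shows one way to do that; it is not the existing `_apply_pronunciation_fixes` (whose body is not shown in this plan), and the whole-word, case-sensitive matching is an assumption.

```python
import re

PRONUNCIATION_FIXES = {
    "Castopod": "Casto pod",
    "vs": "versus",
    "govt": "government",
    "dept": "department",
}

# Longest keys first so a longer key is never shadowed by a shorter one.
_FIX_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(PRONUNCIATION_FIXES, key=len, reverse=True))
    + r")\b"
)


def apply_pronunciation_fixes(text: str) -> str:
    """Replace whole-word matches only; text inside longer words is untouched."""
    return _FIX_PATTERN.sub(lambda m: PRONUNCIATION_FIXES[m.group(1)], text)
```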
**Step 2: Commit**
```bash
git add backend/main.py
git commit -m "Remove pronunciation fixes that produce worse TTS output"
```
---
### Task 4: Add age-awareness to voice matching
**Problem:** Brandy (55 years old) got "Kayla" (young-sounding voice). `_match_voices_to_styles()` scores on style dimensions (weight, energy, warmth, age_feel) but the `age_feel` preference comes from the communication style, not the character's actual age. A "confrontational" style prefers `age_feel: None` (no preference), so a 55-year-old can get a young voice.
**Files:**
- Modify: `backend/main.py:6106-6156` (`_match_voices_to_styles`)
**Step 1: Add character age to voice scoring**
In `_match_voices_to_styles`, after getting the style preferences, override `age_feel` based on the caller's actual age from their background:
```python
def _match_voices_to_styles():
    """Re-assign voices to match caller communication styles after backgrounds are generated."""
    from .services.tts import VOICE_PROFILES
    for key, base in CALLER_BASES.items():
        if base.get("returning"):
            continue
        style_raw = session.caller_styles.get(key, "")
        if not style_raw:
            continue
        style_key = _normalize_style_key(style_raw)
        prefs = STYLE_VOICE_PREFERENCES.get(style_key)
        if not prefs:
            continue
        # Copy prefs so we don't mutate the shared dict
        prefs = dict(prefs)
        # Override age_feel based on character's actual age
        bg = session.caller_backgrounds.get(key)
        if isinstance(bg, CallerBackground) and bg.age:
            if bg.age >= 50:
                prefs["age_feel"] = "mature"
            elif bg.age >= 35:
                prefs["age_feel"] = "middle"
            elif bg.age < 25:
                prefs["age_feel"] = "young"
            # 25-34: keep style preference or None
        gender = base["gender"]
        pool = INWORLD_MALE_VOICES if gender == "male" else INWORLD_FEMALE_VOICES
        voice_pool = [v for v in pool if v not in BLACKLISTED_VOICES]
        scored = []
        for voice_name in voice_pool:
            profile = VOICE_PROFILES.get(voice_name)
            if not profile:
                scored.append((voice_name, 0))
                continue
            score = 0
            for dim in ["weight", "energy", "warmth", "age_feel"]:
                pref_val = prefs.get(dim)
                if pref_val and profile.get(dim) == pref_val:
                    score += 1
            scored.append((voice_name, score))
        if scored:
            names = [s[0] for s in scored]
            weights = [max(1, s[1] * 3) for s in scored]
            chosen = random.choices(names, weights=weights, k=1)[0]
            used_voices = {CALLER_BASES[k]["voice"] for k in CALLER_BASES if k != key and "voice" in CALLER_BASES[k]}
            if chosen in used_voices:
                alternatives = [(n, w) for n, w in zip(names, weights) if n not in used_voices]
                if alternatives:
                    alt_names, alt_weights = zip(*alternatives)
                    chosen = random.choices(alt_names, weights=alt_weights, k=1)[0]
            old_voice = base.get("voice", "")
            base["voice"] = chosen
            if old_voice != chosen:
                print(f"[VoiceMatch] {base.get('name', key)}: {old_voice} -> {chosen} (style: {style_key}, age: {bg.age if isinstance(bg, CallerBackground) else '?'})")
```
**Step 2: Commit**
```bash
git add backend/main.py
git commit -m "Add age-awareness to voice matching — 55yo won't get young voices"
```
---
### Task 5: Raise minimum response word count
**Problem:** `MIN_RESPONSE_WORDS = 30` lets through fragmented, telegram-style responses that are technically 30+ words but terrible radio.
**Files:**
- Modify: `backend/main.py:8844` (`MIN_RESPONSE_WORDS`)
**Step 1: Raise the minimum**
Change line 8844:
```python
MIN_RESPONSE_WORDS = 50 # Retry if response is shorter than this
```
50 words is roughly 2-3 spoken sentences — enough to be a coherent radio response without being overly demanding for short-form exchanges.
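A minimal sketch of the gate this constant controls, assuming words are counted by whitespace splitting. The actual counting and retry logic in `backend/main.py` is not shown in this plan, and `word_count` and `needs_retry` are hypothetical names:

```python
MIN_RESPONSE_WORDS = 50  # Retry if response is shorter than this


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def needs_retry(text: str, minimum: int = MIN_RESPONSE_WORDS) -> bool:
    """True when a response falls short of the minimum word count."""
    return word_count(text) < minimum
```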
**Step 2: Commit**
```bash
git add backend/main.py
git commit -m "Raise MIN_RESPONSE_WORDS from 30 to 50"
```
+261
@@ -0,0 +1,261 @@
"""Fetch instrumental background music from Jamendo for the radio show.
Pixabay has no public music API — this uses Jamendo's free API instead.
All tracks are Creative Commons licensed. Attribution is saved to music/CREDITS.txt.
Setup: Get a free client_id at https://devportal.jamendo.com
Add JAMENDO_CLIENT_ID=your_id to .env
Usage:
python fetch_music.py # download 20 tracks across all genres
python fetch_music.py --genre jazz # download jazz only
python fetch_music.py --count 50 # download 50 tracks
python fetch_music.py --list # just list available tracks, don't download
"""
import argparse
import os
import re
import sys
from pathlib import Path
import httpx
from dotenv import load_dotenv
load_dotenv()
MUSIC_DIR = Path(__file__).parent / "music"
CREDITS_FILE = MUSIC_DIR / "CREDITS.txt"
API_BASE = "https://api.jamendo.com/v3.0"
# Genres good for a late-night radio show
GENRES = ["jazz", "lofi", "blues", "ambient", "acoustic", "funk", "chill"]
# Map search tags to labels that _detect_genre() in main.py can match
# jazz, blues, funk, lo-fi are already in GENRE_KEYWORDS
# ambient, acoustic, chill would need to be added for auto-detection
GENRE_LABELS = {
    "jazz": "Jazz",
    "lofi": "Lo-Fi",
    "blues": "Blues",
    "ambient": "Ambient",
    "acoustic": "Acoustic",
    "funk": "Funk",
    "chill": "Chill",
}
def get_client_id():
    key = os.getenv("JAMENDO_CLIENT_ID")
    if not key:
        print("Error: JAMENDO_CLIENT_ID not found in .env")
        print("Get one free at https://devportal.jamendo.com")
        sys.exit(1)
    return key


def sanitize_filename(name: str) -> str:
    return re.sub(r'[<>:"/\\|?*]', '', name).strip()


def _has_vocals(track: dict) -> bool:
    """Check musicinfo for vocal indicators — catches tracks Jamendo mis-tagged as instrumental."""
    mi = track.get("musicinfo", {})
    # Check the vocalinstrumental field in musicinfo (separate from the API filter)
    vi = mi.get("vocalinstrumental")
    if vi and vi.lower() == "vocal":
        return True
    # Check tags for vocal/singing indicators
    tags = mi.get("tags", {})
    # tags can be {"genres": [...], "instruments": [...], "vartags": [...]}
    all_tags = []
    if isinstance(tags, dict):
        for v in tags.values():
            if isinstance(v, list):
                all_tags.extend(t.lower() for t in v)
    elif isinstance(tags, list):
        all_tags = [t.lower() for t in tags]
    vocal_tags = {"vocals", "vocal", "singing", "singer", "voice", "lyrics",
                  "rap", "hiphop", "hip-hop", "spoken", "spoken word"}
    if vocal_tags & set(all_tags):
        return True
    # Check track name for vocal giveaways
    name_lower = track.get("name", "").lower()
    if any(w in name_lower for w in ["feat.", "ft.", "vocal", "remix vocal", "(voice"]):
        return True
    return False
def search_tracks(client: httpx.Client, client_id: str, genre: str, limit: int = 20) -> list[dict]:
    # Request more than needed so we can filter out vocal false positives
    fetch_limit = min(limit * 3, 200)
    params = {
        "client_id": client_id,
        "format": "json",
        "limit": fetch_limit,
        "vocalinstrumental": "instrumental",
        "fuzzytags": genre,
        "durationbetween": "60_300",
        "include": "musicinfo+licenses",
        "order": "popularity_total",
    }
    resp = client.get(f"{API_BASE}/tracks/", params=params)
    resp.raise_for_status()
    data = resp.json()

    if data["headers"]["status"] != "success":
        print(f" API error: {data['headers'].get('error_message', 'unknown')}")
        return []

    results = data.get("results", [])

    # Post-filter: reject tracks with vocal indicators despite the API filter
    filtered = []
    skipped = 0  # count only tracks actually rejected, not ones left unexamined after the break
    for t in results:
        if _has_vocals(t):
            print(f" SKIP (vocals detected): {t.get('artist_name', '?')} - {t.get('name', '?')}")
            skipped += 1
            continue
        filtered.append(t)
        if len(filtered) >= limit:
            break

    if skipped:
        print(f" (filtered out {skipped} tracks with vocal indicators)")
    return filtered

def make_filename(track: dict, genre_tag: str) -> str:
    artist = sanitize_filename(track.get("artist_name", "Unknown"))
    title = sanitize_filename(track.get("name", "Untitled"))
    label = GENRE_LABELS.get(genre_tag, genre_tag.title())

    # Include genre tag if not already detectable from artist/title
    lower = f"{artist} {title}".lower()
    needs_tag = not any(kw in lower for kw in [genre_tag, label.lower()])
    if needs_tag:
        return f"{artist} - {title} [{label}].mp3"
    return f"{artist} - {title}.mp3"

def download_track(client: httpx.Client, track: dict, filepath: Path, index: int, total: int) -> bool:
    url = track.get("audiodownload")
    if not url:
        print(f" [{index}/{total}] SKIP (no download URL): {track['name']}")
        return False
    if not track.get("audiodownload_allowed", True):
        print(f" [{index}/{total}] SKIP (download not allowed): {track['name']}")
        return False

    print(f" [{index}/{total}] Downloading: {filepath.name}...", end=" ", flush=True)
    resp = client.get(url, follow_redirects=True)
    resp.raise_for_status()
    filepath.write_bytes(resp.content)

    size_mb = len(resp.content) / (1024 * 1024)
    dur = track.get("duration", 0)
    print(f"{size_mb:.1f} MB, {dur // 60}:{dur % 60:02d}")
    return True

def save_credit(track: dict, filename: str):
    artist = track.get("artist_name", "Unknown")
    title = track.get("name", "Untitled")
    license_url = track.get("license_ccurl", "")
    share_url = track.get("shareurl", "")
    line = f"{filename} | {artist} - {title} | {license_url} | {share_url}\n"

    existing = CREDITS_FILE.read_text() if CREDITS_FILE.exists() else ""
    if filename not in existing:
        with open(CREDITS_FILE, "a") as f:
            if not existing:
                f.write("# Music Credits (Jamendo - Creative Commons)\n")
                f.write("# File | Artist - Title | License | URL\n\n")
            f.write(line)

def main():
    parser = argparse.ArgumentParser(description="Download instrumental music from Jamendo")
    parser.add_argument("--genre", choices=GENRES, help="Download only this genre")
    parser.add_argument("--count", type=int, default=20, help="Total tracks to download (default: 20)")
    parser.add_argument("--list", action="store_true", help="List available tracks without downloading")
    args = parser.parse_args()

    client_id = get_client_id()
    MUSIC_DIR.mkdir(exist_ok=True)

    genres = [args.genre] if args.genre else GENRES
    # divmod keeps the per-genre limits summing to exactly args.count,
    # even when --count is smaller than the number of genres
    per_genre, remainder = divmod(args.count, len(genres))

    all_tracks = []
    seen_ids = set()

    with httpx.Client(timeout=30) as api_client:
        for i, genre in enumerate(genres):
            limit = per_genre + (1 if i < remainder else 0)
            if limit <= 0:
                continue
            print(f"Searching {genre}...", end=" ", flush=True)
            tracks = search_tracks(api_client, client_id, genre, limit)
            # Deduplicate across genres
            added = 0
            for t in tracks:
                if t["id"] not in seen_ids and added < limit:
                    t["_genre_tag"] = genre
                    all_tracks.append(t)
                    seen_ids.add(t["id"])
                    added += 1
            print(f"{added} tracks")

    if not all_tracks:
        print("No tracks found.")
        return

    if args.list:
        print(f"\n{'#':<4} {'Genre':<10} {'Artist':<25} {'Title':<40} {'Duration':<8}")
        print("-" * 90)
        for i, t in enumerate(all_tracks, 1):
            dur = f"{t['duration'] // 60}:{t['duration'] % 60:02d}"
            artist = t["artist_name"][:24]
            title = t["name"][:39]
            label = GENRE_LABELS.get(t["_genre_tag"], t["_genre_tag"])
            print(f"{i:<4} {label:<10} {artist:<25} {title:<40} {dur:<8}")
        print(f"\n{len(all_tracks)} tracks available")
        return

    # Download phase
    downloaded = 0
    skipped_exists = 0
    skipped_error = 0

    with httpx.Client(timeout=120, follow_redirects=True) as dl_client:
        for i, track in enumerate(all_tracks, 1):
            filename = make_filename(track, track["_genre_tag"])
            filepath = MUSIC_DIR / filename

            if filepath.exists():
                print(f" [{i}/{len(all_tracks)}] EXISTS: {filename}")
                skipped_exists += 1
                continue

            try:
                if download_track(dl_client, track, filepath, i, len(all_tracks)):
                    save_credit(track, filename)
                    downloaded += 1
                else:
                    skipped_error += 1
            except Exception as e:
                print(f" [{i}/{len(all_tracks)}] ERROR: {e}")
                # Clean up partial download
                if filepath.exists():
                    filepath.unlink()
                skipped_error += 1

    print(f"\nDone: {downloaded} downloaded, {skipped_exists} existed, {skipped_error} skipped")


if __name__ == "__main__":
    main()
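
The per-genre budgeting in main() can be illustrated in isolation. The following is a minimal sketch, not part of the script: `split_quota` is a hypothetical helper name introduced here for illustration. It shows one way to divide a total track count across genres so that the per-genre limits sum to the requested total, including the case where the total is smaller than the number of genres.

```python
def split_quota(total: int, buckets: int) -> list[int]:
    """Split `total` into `buckets` non-negative integers that sum to `total`.

    Hypothetical helper for illustration only; it is not defined in the script.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    # Negative totals are treated as zero so every quota stays non-negative.
    base, extra = divmod(max(total, 0), buckets)
    # The first `extra` buckets absorb the remainder, one item each.
    return [base + (1 if i < extra else 0) for i in range(buckets)]


if __name__ == "__main__":
    genres = ["jazz", "lofi", "blues", "ambient", "acoustic", "funk", "chill"]
    for genre, quota in zip(genres, split_quota(20, len(genres))):
        print(f"{genre}: {quota}")
```

A genre whose quota comes out as 0 would simply be skipped by the caller, which matches the `if limit <= 0: continue` guard in main().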
+391 -23
@@ -113,6 +113,69 @@ header button:hover {
border-color: rgba(232, 121, 29, 0.3); border-color: rgba(232, 121, 29, 0.3);
} }
.theme-bar {
display: flex;
align-items: center;
gap: 6px;
padding: 4px 12px;
background: rgba(255, 255, 255, 0.05);
border-radius: 6px;
}
.theme-label {
font-size: 0.8rem;
color: #aaa;
white-space: nowrap;
}
.theme-input {
background: rgba(255, 255, 255, 0.08);
border: 1px solid rgba(255, 255, 255, 0.15);
border-radius: 4px;
color: #fff;
padding: 4px 8px;
font-size: 0.85rem;
width: 200px;
}
.theme-input:focus {
outline: none;
border-color: #f5a623;
}
.theme-input.active {
border-color: #f5a623;
background: rgba(245, 166, 35, 0.1);
}
.theme-btn {
padding: 4px 10px;
border-radius: 4px;
border: none;
cursor: pointer;
font-size: 0.8rem;
}
.theme-btn.set {
background: #f5a623;
color: #000;
}
.theme-btn.set:hover {
background: #e6991a;
}
.theme-btn.clear {
background: rgba(255, 255, 255, 0.1);
color: #aaa;
padding: 4px 6px;
}
.theme-btn.clear:hover {
background: rgba(255, 80, 80, 0.3);
color: #ff5050;
}
.on-air-btn { .on-air-btn {
font-weight: 700; font-weight: 700;
text-transform: uppercase; text-transform: uppercase;
@@ -284,9 +347,14 @@ section h2 {
} }
.caller-btn.active { .caller-btn.active {
background: var(--accent); background: var(--bg);
border-color: var(--accent); border-color: transparent;
}
.caller-btn.active .caller-name {
color: #fff; color: #fff;
background: var(--accent);
padding: 2px 8px;
border-radius: 4px;
} }
.call-status { .call-status {
@@ -400,6 +468,84 @@ section h2 {
line-height: 1.3; line-height: 1.3;
} }
/* Caller model indicator */
.info-badge.model {
background: rgba(100, 140, 220, 0.2);
color: #7ab0e8;
font-size: 0.7rem;
cursor: pointer;
}
.caller-model-override {
font-size: 0.7rem;
padding: 2px 4px;
background: var(--bg);
color: var(--text);
border: 1px solid rgba(100, 140, 220, 0.3);
border-radius: 4px;
max-width: 140px;
}
/* Caller button model badge */
.model-tag {
font-size: 0.55rem;
color: #7ab0e8;
background: rgba(100, 140, 220, 0.15);
padding: 0 3px;
border-radius: 2px;
font-weight: 700;
letter-spacing: 0.3px;
flex-shrink: 0;
}
/* Caller Models settings section */
.caller-model-row {
margin-bottom: 8px;
}
.caller-model-row label {
margin-bottom: 0;
}
.cm-pool-input {
font-size: 0.8rem;
}
.cm-style-grid {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 4px;
margin-bottom: 8px;
max-height: 200px;
overflow-y: auto;
}
.cm-style-item {
display: flex;
align-items: center;
justify-content: space-between;
gap: 4px;
background: rgba(255, 255, 255, 0.05);
border-radius: 4px;
padding: 3px 6px;
}
.cm-style-name {
font-size: 0.7rem;
color: var(--text-muted);
white-space: nowrap;
}
.cm-style-select {
font-size: 0.7rem;
padding: 2px 3px;
background: var(--bg);
color: var(--text);
border: 1px solid rgba(232, 121, 29, 0.15);
border-radius: 4px;
max-width: 110px;
}
.caller-background-full { .caller-background-full {
margin-top: 8px; margin-top: 8px;
font-size: 0.75rem; font-size: 0.75rem;
@@ -586,19 +732,6 @@ section h2 {
margin-bottom: 10px; margin-bottom: 10px;
} }
.music-section select optgroup {
color: var(--accent);
font-weight: bold;
font-style: normal;
padding: 4px 0;
}
.music-section select option {
color: var(--text);
font-weight: normal;
padding: 2px 8px;
}
.music-controls { .music-controls {
display: flex; display: flex;
gap: 8px; gap: 8px;
@@ -625,6 +758,83 @@ section h2 {
accent-color: var(--accent); accent-color: var(--accent);
} }
/* Genre Quick-Select */
.genre-section {
grid-column: span 3;
}
.genre-grid {
display: flex;
flex-wrap: wrap;
gap: 6px;
margin-bottom: 8px;
}
.genre-btn {
background: var(--bg);
color: var(--text);
border: 1px solid rgba(232, 121, 29, 0.12);
padding: 6px 12px;
border-radius: var(--radius-sm);
cursor: pointer;
font-size: 0.8rem;
transition: all 0.15s;
white-space: nowrap;
}
.genre-btn:hover {
border-color: var(--accent);
background: #2a1e10;
color: #fff;
}
.genre-btn.active {
background: var(--accent);
border-color: var(--accent);
color: #fff;
font-weight: 600;
}
.now-playing {
display: flex;
align-items: center;
gap: 8px;
padding: 4px 0;
}
.now-playing-text {
font-size: 0.75rem;
color: var(--text-muted);
flex: 0 1 auto;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
min-width: 0;
}
.now-playing-stop {
background: var(--bg);
color: var(--text);
border: 1px solid rgba(232, 121, 29, 0.15);
padding: 4px 10px;
border-radius: var(--radius-sm);
cursor: pointer;
font-size: 0.75rem;
flex-shrink: 0;
transition: all 0.15s;
}
.now-playing-stop:hover {
border-color: var(--accent);
background: #2a1e10;
}
.now-playing-volume {
width: 80px;
flex-shrink: 0;
accent-color: var(--accent);
}
/* Soundboard */ /* Soundboard */
.sounds-section { .sounds-section {
grid-column: span 2; grid-column: span 2;
@@ -771,7 +981,7 @@ section h2 {
padding: 24px; padding: 24px;
border-radius: var(--radius); border-radius: var(--radius);
width: 90%; width: 90%;
max-width: 400px; max-width: 550px;
border: 1px solid rgba(232, 121, 29, 0.15); border: 1px solid rgba(232, 121, 29, 0.15);
} }
@@ -802,22 +1012,63 @@ section h2 {
flex: 1; flex: 1;
} }
.channel-row { .routing-grid {
display: flex; display: grid;
gap: 12px; grid-template-columns: repeat(3, 1fr);
margin-top: 8px; gap: 6px;
} }
.channel-row label { .routing-item {
display: flex; display: flex;
align-items: center; align-items: center;
justify-content: space-between;
gap: 4px; gap: 4px;
font-size: 0.85rem; background: rgba(255, 255, 255, 0.05);
border-radius: 6px;
padding: 5px 8px;
}
.routing-item label {
margin-bottom: 0 !important;
}
.routing-label {
font-size: 0.75rem;
color: var(--text-muted, #9a8b78);
white-space: nowrap;
} }
.channel-input { .channel-input {
width: 50px !important; width: 40px !important;
text-align: center; text-align: center;
font-size: 0.8rem;
padding: 3px !important;
}
.model-routing-grid {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 8px;
}
.model-routing-item {
display: flex;
flex-direction: column;
gap: 3px;
}
.model-routing-item label {
margin-bottom: 0 !important;
}
.model-routing-label {
font-size: 0.75rem;
color: var(--text-muted, #9a8b78);
}
.model-select {
font-size: 0.8rem !important;
padding: 4px 6px !important;
} }
.modal-content label { .modal-content label {
@@ -1484,6 +1735,16 @@ section h2 {
font-size: 0.8rem; font-size: 0.8rem;
} }
.media-row .genre-section {
grid-column: span 3;
}
@media (max-width: 700px) {
.media-row .genre-section {
grid-column: span 1;
}
}
/* Devon (Intern) */ /* Devon (Intern) */
.message.devon { .message.devon {
border-left: 3px solid var(--devon); border-left: 3px solid var(--devon);
@@ -1673,3 +1934,110 @@ button:focus-visible {
.log-toggle-btn:hover { .log-toggle-btn:hover {
color: var(--text); color: var(--text);
} }
/* Preflight */
.preflight-btn {
background: rgba(90, 138, 60, 0.15);
color: var(--accent-green);
border: 1px solid rgba(90, 138, 60, 0.3);
}
.preflight-btn:hover {
background: rgba(90, 138, 60, 0.25);
}
.preflight-content {
max-width: 700px;
}
.preflight-status {
display: flex;
align-items: center;
gap: 10px;
padding: 12px 16px;
border-radius: var(--radius-sm);
margin-bottom: 16px;
font-weight: 700;
font-size: 1.1rem;
}
.preflight-status.pass { background: rgba(90, 138, 60, 0.15); color: var(--accent-green); }
.preflight-status.warn { background: rgba(232, 169, 29, 0.15); color: #e8a91d; }
.preflight-status.fail { background: rgba(204, 34, 34, 0.15); color: var(--accent-red); }
.preflight-status.loading { background: rgba(232, 121, 29, 0.1); color: var(--text-muted); }
.preflight-checks {
display: flex;
flex-direction: column;
gap: 12px;
max-height: 60vh;
overflow-y: auto;
}
.preflight-check {
background: var(--bg);
border: 1px solid rgba(232, 121, 29, 0.1);
border-radius: var(--radius-sm);
padding: 12px 16px;
}
.preflight-check-header {
display: flex;
justify-content: space-between;
align-items: center;
cursor: pointer;
user-select: none;
}
.preflight-check-name {
font-weight: 600;
font-size: 0.95rem;
}
.preflight-check-badge {
font-size: 0.75rem;
font-weight: 700;
padding: 2px 8px;
border-radius: 4px;
text-transform: uppercase;
}
.preflight-check-badge.pass { background: rgba(90, 138, 60, 0.2); color: var(--accent-green); }
.preflight-check-badge.warn { background: rgba(232, 169, 29, 0.2); color: #e8a91d; }
.preflight-check-badge.fail { background: rgba(204, 34, 34, 0.2); color: var(--accent-red); }
.preflight-check-badge.skip { background: rgba(154, 139, 120, 0.2); color: var(--text-muted); }
.preflight-check-details {
margin-top: 10px;
font-size: 0.85rem;
color: var(--text-muted);
display: none;
}
.preflight-check.open .preflight-check-details {
display: block;
}
.preflight-table {
width: 100%;
border-collapse: collapse;
margin-top: 8px;
}
.preflight-table th {
text-align: left;
color: var(--text-muted);
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
padding: 4px 8px;
border-bottom: 1px solid rgba(232, 121, 29, 0.1);
}
.preflight-table td {
padding: 4px 8px;
font-size: 0.8rem;
color: var(--text);
border-bottom: 1px solid rgba(232, 121, 29, 0.05);
}
.preflight-table tr.mismatch td { color: var(--accent-red); }
.preflight-table tr.connected td { color: var(--accent-green); }
.preflight-test-btn {
background: rgba(232, 121, 29, 0.15);
color: var(--accent);
border: 1px solid rgba(232, 121, 29, 0.3);
}
.preflight-test-btn:hover { background: rgba(232, 121, 29, 0.25); }
.preflight-test-btn.loading { opacity: 0.6; pointer-events: none; }
+117 -36
@@ -4,7 +4,7 @@
<meta charset="UTF-8"> <meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Luke at The Roost</title> <title>Luke at The Roost</title>
<link rel="stylesheet" href="/css/style.css"> <link rel="stylesheet" href="/css/style.css?v=2">
</head> </head>
<body> <body>
<div id="app"> <div id="app">
@@ -15,8 +15,15 @@
<button id="rec-btn" class="rec-btn" title="Record stems for post-production">REC</button> <button id="rec-btn" class="rec-btn" title="Record stems for post-production">REC</button>
<button id="new-session-btn" class="new-session-btn">New Session</button> <button id="new-session-btn" class="new-session-btn">New Session</button>
<button id="export-session-btn">Export</button> <button id="export-session-btn">Export</button>
<button id="preflight-btn" class="preflight-btn">Preflight</button>
<button id="settings-btn">Settings</button> <button id="settings-btn">Settings</button>
</div> </div>
<div class="theme-bar">
<label for="show-theme-input" class="theme-label">Theme:</label>
<input type="text" id="show-theme-input" class="theme-input" placeholder="e.g. St. Patrick's Day" maxlength="100">
<button id="set-theme-btn" class="theme-btn set" title="Set show theme">Set</button>
<button id="clear-theme-btn" class="theme-btn clear hidden" title="Clear theme">&#x2715;</button>
</div>
<div id="show-clock" class="show-clock"> <div id="show-clock" class="show-clock">
<span class="clock-time" id="clock-time"></span> <span class="clock-time" id="clock-time"></span>
<span id="show-timers" class="show-timers hidden"> <span id="show-timers" class="show-timers hidden">
@@ -69,6 +76,8 @@
<span id="caller-shape-badge" class="info-badge shape"></span> <span id="caller-shape-badge" class="info-badge shape"></span>
<span id="caller-energy-badge" class="info-badge energy"></span> <span id="caller-energy-badge" class="info-badge energy"></span>
<span id="caller-emotion" class="info-badge emotion"></span> <span id="caller-emotion" class="info-badge emotion"></span>
<span id="caller-model-badge" class="info-badge model"></span>
<select id="caller-model-override" class="caller-model-override hidden"></select>
</div> </div>
<div id="caller-signature" class="caller-signature"></div> <div id="caller-signature" class="caller-signature"></div>
<div id="caller-situation" class="caller-situation"></div> <div id="caller-situation" class="caller-situation"></div>
@@ -134,13 +143,13 @@
<!-- Music / Ads / Idents --> <!-- Music / Ads / Idents -->
<div class="media-row"> <div class="media-row">
<section class="music-section"> <section class="music-section genre-section">
<h2>Music</h2> <h2>Music <span class="shortcut-label">M</span></h2>
<select id="track-select"></select> <div id="genre-buttons" class="genre-grid"></div>
<div class="music-controls"> <div id="now-playing" class="now-playing hidden">
<button id="play-btn">Play <span class="shortcut-label">M</span></button> <span id="now-playing-text" class="now-playing-text"></span>
<button id="stop-btn">Stop</button> <button id="stop-btn" class="now-playing-stop">Stop</button>
<input type="range" id="volume" min="0" max="100" value="30"> <input type="range" id="volume" min="0" max="100" value="30" class="now-playing-volume">
</div> </div>
</section> </section>
@@ -195,7 +204,7 @@
<!-- Audio Devices --> <!-- Audio Devices -->
<div class="settings-group"> <div class="settings-group">
<h3>Audio Routing</h3> <h3>Audio Devices</h3>
<div class="device-row"> <div class="device-row">
<label> <label>
Input Device Input Device
@@ -203,7 +212,7 @@
</label> </label>
<label> <label>
Ch Ch
<input type="number" id="input-channel" value="1" min="1" max="16" class="channel-input"> <input type="number" id="input-channel" value="1" min="1" max="32" class="channel-input">
</label> </label>
</div> </div>
<div class="device-row"> <div class="device-row">
@@ -212,44 +221,100 @@
<select id="output-device"></select> <select id="output-device"></select>
</label> </label>
</div> </div>
<div class="channel-row"> </div>
<label>Caller Ch <input type="number" id="caller-channel" value="3" min="1" max="16" class="channel-input"></label>
<label>Live Ch <input type="number" id="live-caller-channel" value="9" min="1" max="16" class="channel-input"></label> <div class="settings-group">
<label>Music Ch <input type="number" id="music-channel" value="5" min="1" max="16" class="channel-input"></label> <h3>Output Routing</h3>
<label>SFX Ch <input type="number" id="sfx-channel" value="7" min="1" max="16" class="channel-input"></label> <div class="routing-grid">
<label>Ad Ch <input type="number" id="ad-channel" value="11" min="1" max="16" class="channel-input"></label> <div class="routing-item">
<label>Ident Ch <input type="number" id="ident-channel" value="15" min="1" max="16" class="channel-input"></label> <span class="routing-label">AI Caller</span>
<input type="number" id="caller-channel" value="3" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">Devon</span>
<input type="number" id="devon-channel" value="17" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">Live Caller</span>
<input type="number" id="live-caller-channel" value="9" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">Music</span>
<input type="number" id="music-channel" value="5" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">SFX</span>
<input type="number" id="sfx-channel" value="7" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">Ads</span>
<input type="number" id="ad-channel" value="11" min="1" max="32" class="channel-input">
</div>
<div class="routing-item">
<span class="routing-label">Idents</span>
<input type="number" id="ident-channel" value="15" min="1" max="32" class="channel-input">
</div>
</div> </div>
</div> </div>
<!-- LLM Settings --> <!-- LLM Settings -->
<div class="settings-group"> <div class="settings-group">
<h3>LLM Provider</h3> <h3>LLM Model Routing</h3>
<label> <div class="model-routing-grid">
Provider <div class="model-routing-item">
<select id="provider"> <span class="model-routing-label">Caller Dialog</span>
<option value="openrouter">OpenRouter</option> <select id="model-caller_dialog" class="model-select"></select>
<option value="ollama">Ollama</option> </div>
</select> <div class="model-routing-item">
</label> <span class="model-routing-label">Devon Monitor</span>
<select id="model-devon_monitor" class="model-select"></select>
</div>
<div class="model-routing-item">
<span class="model-routing-label">Devon Ask</span>
<select id="model-devon_ask" class="model-select"></select>
</div>
<div class="model-routing-item">
<span class="model-routing-label">Backgrounds</span>
<select id="model-background_gen" class="model-select"></select>
</div>
<div class="model-routing-item">
<span class="model-routing-label">Call Summary</span>
<select id="model-call_summary" class="model-select"></select>
</div>
<div class="model-routing-item">
<span class="model-routing-label">News</span>
<select id="model-news_summary" class="model-select"></select>
</div>
</div>
</div>
<div id="openrouter-settings"> <!-- Caller Model Routing -->
<div class="settings-group">
<h3>Caller Models</h3>
<div class="caller-model-row">
<label> <label>
Model Strategy
<select id="openrouter-model"></select> <select id="cm-strategy">
<option value="single">Single Model</option>
<option value="cycle">Cycle Models</option>
<option value="style_matched">Style-Matched</option>
</select>
</label> </label>
</div> </div>
<div id="cm-pool-section" class="hidden">
<div id="ollama-settings" class="hidden">
<label> <label>
Model Model Pool
<select id="ollama-model"></select> <input type="text" id="cm-pool" class="cm-pool-input" placeholder="x-ai/grok-4, deepseek/deepseek-v3.2, ...">
</label> </label>
</div>
<div id="cm-style-map" class="hidden">
<div class="cm-style-grid" id="cm-style-grid"></div>
</div>
<div class="caller-model-row">
<label> <label>
Host Fallback Model
<input type="text" id="ollama-host" value="http://localhost:11434"> <select id="cm-fallback" class="model-select"></select>
</label> </label>
<button type="button" id="refresh-ollama" class="refresh-btn">Refresh Models</button>
</div> </div>
</div> </div>
@@ -293,8 +358,24 @@
</div> </div>
</div> </div>
</div> </div>
<!-- Preflight Modal -->
<div id="preflight-modal" class="modal hidden">
<div class="modal-content preflight-content">
<h2>Show Preflight</h2>
<div id="preflight-status" class="preflight-status loading">
<span class="preflight-status-icon">...</span>
<span class="preflight-status-text">Running checks...</span>
</div>
<div id="preflight-checks" class="preflight-checks"></div>
<div class="modal-buttons">
<button id="preflight-test-btn" class="preflight-test-btn">Test Responses</button>
<button id="preflight-rerun-btn">Re-run</button>
<button id="close-preflight">Close</button>
</div>
</div>
</div>
</div> </div>
<script src="/js/app.js?v=22"></script> <script src="/js/app.js?v=27"></script>
</body> </body>
</html> </html>
+757 -111
File diff suppressed because it is too large
+142 -114
@@ -23,6 +23,8 @@ import tempfile
 import xml.etree.ElementTree as ET
 from pathlib import Path
+import time
 import requests
 from dotenv import load_dotenv
@@ -46,6 +48,50 @@ WIDTH = 1080
 HEIGHT = 1920
+def _llm_request(prompt: str, max_tokens: int = 2048, temperature: float = 0.3,
+                 timeout: int = 60) -> str | None:
+    """Make an LLM API call with timeout and retry. Returns content or None on failure."""
+    for attempt in range(2):
+        try:
+            response = requests.post(
+                "https://openrouter.ai/api/v1/chat/completions",
+                headers={
+                    "Authorization": f"Bearer {OPENROUTER_API_KEY}",
+                    "Content-Type": "application/json",
+                },
+                json={
+                    "model": "anthropic/claude-sonnet-4-5",
+                    "messages": [{"role": "user", "content": prompt}],
+                    "max_tokens": max_tokens,
+                    "temperature": temperature,
+                },
+                timeout=timeout,
+            )
+            if response.status_code != 200:
+                print(f" LLM error (HTTP {response.status_code}): {response.text[:200]}")
+                if attempt == 0:
+                    print(f" Retrying in 5s...")
+                    time.sleep(5)
+                    continue
+                return None
+            return response.json()["choices"][0]["message"]["content"].strip()
+        except requests.Timeout:
+            print(f" LLM request timed out ({timeout}s)")
+            if attempt == 0:
+                print(f" Retrying in 5s...")
+                time.sleep(5)
+                continue
+            return None
+        except Exception as e:
+            print(f" LLM request failed: {e}")
+            if attempt == 0:
+                print(f" Retrying in 5s...")
+                time.sleep(5)
+                continue
+            return None
+    return None
 def _build_whisper_prompt(labeled_transcript: str) -> str:
     """Build an initial_prompt for Whisper from the labeled transcript.
@@ -186,7 +232,12 @@ def refine_clip_timestamps(audio_path: str, clips: list[dict],
             "ffmpeg", "-y", "-ss", str(seg_start), "-t", str(seg_end - seg_start),
             "-i", audio_path, "-ar", "16000", "-ac", "1", seg_path,
         ]
-        result = subprocess.run(cmd, capture_output=True, text=True)
+        try:
+            result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
+        except subprocess.TimeoutExpired:
+            print(f" Clip {i+1}: ffmpeg timed out (120s), skipping")
+            refined[i] = []
+            continue
         if result.returncode != 0:
             print(f" Clip {i+1}: Failed to extract segment")
             refined[i] = []
@@ -279,25 +330,11 @@ IMPORTANT:
 Respond with ONLY a JSON array, no markdown or explanation:
 [{{"title": "...", "start_time": 0.0, "end_time": 0.0, "caption_text": "..."}}]"""
-    response = requests.post(
-        "https://openrouter.ai/api/v1/chat/completions",
-        headers={
-            "Authorization": f"Bearer {OPENROUTER_API_KEY}",
-            "Content-Type": "application/json",
-        },
-        json={
-            "model": "anthropic/claude-sonnet-4-5",
-            "messages": [{"role": "user", "content": prompt}],
-            "max_tokens": 2048,
-            "temperature": 0.3,
-        },
-    )
-    if response.status_code != 200:
-        print(f"Error from OpenRouter: {response.text}")
-        sys.exit(1)
-    content = response.json()["choices"][0]["message"]["content"].strip()
+    content = _llm_request(prompt, max_tokens=2048, temperature=0.3, timeout=60)
+    if content is None:
+        print(" Failed to get clip selections from LLM — aborting")
+        return []
     if content.startswith("```"):
         content = re.sub(r"^```(?:json)?\n?", "", content)
         content = re.sub(r"\n?```$", "", content)
@@ -307,7 +344,7 @@ Respond with ONLY a JSON array, no markdown or explanation:
     except json.JSONDecodeError as e:
         print(f"Error parsing LLM response: {e}")
         print(f"Response was: {content[:500]}")
-        sys.exit(1)
+        return []
     # Validate and clamp durations
     validated = []
@@ -349,25 +386,11 @@ For each clip, generate:
 Respond with ONLY a JSON array matching the clip order:
 [{{"description": "...", "hashtags": ["#tag1", "#tag2", ...]}}]"""
-    response = requests.post(
-        "https://openrouter.ai/api/v1/chat/completions",
-        headers={
-            "Authorization": f"Bearer {OPENROUTER_API_KEY}",
-            "Content-Type": "application/json",
-        },
-        json={
-            "model": "anthropic/claude-sonnet-4-5",
-            "messages": [{"role": "user", "content": prompt}],
-            "max_tokens": 2048,
-            "temperature": 0.7,
-        },
-    )
-    if response.status_code != 200:
-        print(f"Error from OpenRouter: {response.text}")
+    content = _llm_request(prompt, max_tokens=2048, temperature=0.7, timeout=60)
+    if content is None:
+        print(" Failed to generate social metadata — skipping")
         return clips
-    content = response.json()["choices"][0]["message"]["content"].strip()
     if content.startswith("```"):
         content = re.sub(r"^```(?:json)?\n?", "", content)
         content = re.sub(r"\n?```$", "", content)
@@ -777,43 +800,25 @@ RULES:
 RAW TEXT ({len(words)} words):
 {raw_text}"""
-    try:
-        response = requests.post(
-            "https://openrouter.ai/api/v1/chat/completions",
-            headers={
-                "Authorization": f"Bearer {OPENROUTER_API_KEY}",
-                "Content-Type": "application/json",
-            },
-            json={
-                "model": "anthropic/claude-sonnet-4-5",
-                "messages": [{"role": "user", "content": prompt}],
-                "max_tokens": 2048,
-                "temperature": 0,
-            },
-            timeout=30,
-        )
-        if response.status_code != 200:
-            print(f" Polish failed ({response.status_code}), using raw text")
-            return words
-        polished = response.json()["choices"][0]["message"]["content"].strip()
-        polished_words = polished.split()
-        if len(polished_words) != len(words):
-            print(f" Polish word count mismatch ({len(polished_words)} vs {len(words)}), using raw text")
-            return words
-        changes = 0
-        for i, pw in enumerate(polished_words):
-            if pw != words[i]["word"]:
-                changes += 1
-                words[i]["word"] = pw
-        if changes:
-            print(f" Polished {changes} words")
-    except Exception as e:
-        print(f" Polish error: {e}")
+    polished = _llm_request(prompt, max_tokens=2048, temperature=0, timeout=30)
+    if polished is None:
+        print(f" Polish failed, using raw text")
+        return words
+    polished_words = polished.split()
+    if len(polished_words) != len(words):
+        print(f" Polish word count mismatch ({len(polished_words)} vs {len(words)}), using raw text")
+        return words
+    changes = 0
+    for i, pw in enumerate(polished_words):
+        if pw != words[i]["word"]:
+            changes += 1
+            words[i]["word"] = pw
+    if changes:
+        print(f" Polished {changes} words")
     return words
@@ -898,8 +903,12 @@ def extract_clip_audio(audio_path: str, start: float, end: float,
         output_path,
     ]
-    result = subprocess.run(cmd, capture_output=True, text=True)
-    return result.returncode == 0
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
+        return result.returncode == 0
+    except subprocess.TimeoutExpired:
+        print(f" ffmpeg audio extraction timed out (120s)")
+        return False
 def generate_background_image(episode_number: int, clip_title: str,
@@ -1153,7 +1162,11 @@ def generate_clip_video(audio_path: str, background_path: str,
         output_path,
     ]
-    result = subprocess.run(cmd, capture_output=True, text=True)
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
+    except subprocess.TimeoutExpired:
+        print(f" ffmpeg video generation timed out (300s)")
+        return False
     if result.returncode != 0:
         print(f" ffmpeg error: {result.stderr[-300:]}")
         return False
@@ -1235,7 +1248,12 @@ def generate_clip_video_remotion(
         output_path,
     ]
-    result = subprocess.run(cmd, capture_output=True, text=True, cwd=str(REMOTION_DIR))
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, cwd=str(REMOTION_DIR), timeout=180)
+    except subprocess.TimeoutExpired:
+        props_path.unlink(missing_ok=True)
+        print(f" Remotion render timed out (180s)")
+        return False
     props_path.unlink(missing_ok=True)
     if result.returncode != 0:
@@ -1488,6 +1506,9 @@ def main():
     print(f"\n[3/{step_total}] Selecting {args.count} best moments with LLM...")
     clips = select_clips_with_llm(transcript_text, labeled_transcript,
                                   chapters_json, args.count)
+    if not clips:
+        print("\nNo clips selected — aborting.")
+        return
     # Snap to sentence boundaries so clips don't start/end mid-sentence
     clips = snap_to_sentences(clips, segments)
@@ -1524,14 +1545,18 @@ def main():
     extract_step = 6 if two_pass else 5
     print(f"\n[{extract_step}/{step_total}] Extracting audio clips...")
     for i, clip in enumerate(clips):
+        print(f" [{i+1}/{len(clips)}] \"{clip['title']}\"...")
         slug = slugify(clip["title"])
         mp3_path = output_dir / f"clip-{i+1}-{slug}.mp3"
-        if extract_clip_audio(str(audio_path), clip["start_time"], clip["end_time"],
-                              str(mp3_path)):
-            print(f" Clip {i+1} audio: {mp3_path.name}")
-        else:
-            print(f" Error extracting clip {i+1} audio")
+        try:
+            if extract_clip_audio(str(audio_path), clip["start_time"], clip["end_time"],
+                                  str(mp3_path)):
+                print(f" Clip {i+1} audio: {mp3_path.name}")
+            else:
+                print(f" Error extracting clip {i+1} audio — skipping")
+        except Exception as e:
+            print(f" Clip {i+1} audio failed: {e} — skipping")
     video_step = 7 if two_pass else 6
     if args.audio_only:
@@ -1553,49 +1578,52 @@ def main():
         mp4_path = output_dir / f"clip-{i+1}-{slug}.mp4"
         duration = clip["end_time"] - clip["start_time"]
-        print(f" Clip {i+1}: Generating video...")
-        # Get word timestamps — use refined segments if available
-        word_source = refined[i] if (two_pass and i in refined and refined[i]) else segments
-        clip_words = get_words_in_range(word_source, clip["start_time"], clip["end_time"])
-        # Add speaker labels
-        clip_words = add_speaker_labels(clip_words, labeled_transcript,
-                                        clip["start_time"], clip["end_time"],
-                                        word_source)
-        # Polish text with LLM (fix punctuation, capitalization, mishearings)
-        clip_words = polish_clip_words(clip_words, labeled_transcript)
-        # Group words into timed caption lines
-        caption_lines = group_words_into_lines(
-            clip_words, clip["start_time"], duration
-        )
-        if use_remotion:
-            if generate_clip_video_remotion(
-                str(mp3_path), caption_lines, clip["start_time"],
-                clip["title"], episode_number, str(mp4_path), duration
-            ):
-                file_size = mp4_path.stat().st_size / (1024 * 1024)
-                print(f" Clip {i+1} video: {mp4_path.name} ({file_size:.1f} MB)")
-            else:
-                print(f" Error generating clip {i+1} video (Remotion)")
-        else:
-            # Legacy PIL+ffmpeg renderer
-            bg_path = str(tmp_dir / f"bg_{i}.png")
-            generate_background_image(episode_number, clip["title"], bg_path)
-            clip_tmp = tmp_dir / f"clip_{i}"
-            clip_tmp.mkdir(exist_ok=True)
-            if generate_clip_video(str(mp3_path), bg_path, caption_lines,
-                                   clip["start_time"], str(mp4_path),
-                                   duration, clip_tmp):
-                file_size = mp4_path.stat().st_size / (1024 * 1024)
-                print(f" Clip {i+1} video: {mp4_path.name} ({file_size:.1f} MB)")
-            else:
-                print(f" Error generating clip {i+1} video")
+        print(f" [{i+1}/{len(clips)}] \"{clip['title']}\" ({duration:.0f}s)...")
+        try:
+            # Get word timestamps — use refined segments if available
+            word_source = refined[i] if (two_pass and i in refined and refined[i]) else segments
+            clip_words = get_words_in_range(word_source, clip["start_time"], clip["end_time"])
+            # Add speaker labels
+            clip_words = add_speaker_labels(clip_words, labeled_transcript,
+                                            clip["start_time"], clip["end_time"],
+                                            word_source)
+            # Polish text with LLM (fix punctuation, capitalization, mishearings)
+            clip_words = polish_clip_words(clip_words, labeled_transcript)
+            # Group words into timed caption lines
+            caption_lines = group_words_into_lines(
+                clip_words, clip["start_time"], duration
+            )
+            if use_remotion:
+                if generate_clip_video_remotion(
+                    str(mp3_path), caption_lines, clip["start_time"],
+                    clip["title"], episode_number, str(mp4_path), duration
+                ):
+                    file_size = mp4_path.stat().st_size / (1024 * 1024)
+                    print(f" Clip {i+1} video: {mp4_path.name} ({file_size:.1f} MB)")
+                else:
+                    print(f" Clip {i+1} video failed (Remotion) — skipping")
+            else:
+                # Legacy PIL+ffmpeg renderer
+                bg_path = str(tmp_dir / f"bg_{i}.png")
+                generate_background_image(episode_number, clip["title"], bg_path)
+                clip_tmp = tmp_dir / f"clip_{i}"
+                clip_tmp.mkdir(exist_ok=True)
+                if generate_clip_video(str(mp3_path), bg_path, caption_lines,
+                                       clip["start_time"], str(mp4_path),
+                                       duration, clip_tmp):
+                    file_size = mp4_path.stat().st_size / (1024 * 1024)
+                    print(f" Clip {i+1} video: {mp4_path.name} ({file_size:.1f} MB)")
+                else:
+                    print(f" Clip {i+1} video failed (ffmpeg) — skipping")
+        except Exception as e:
+            print(f" Clip {i+1} video failed: {e} — skipping")
     # Save clips metadata for social upload
     metadata_path = output_dir / "clips-metadata.json"
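The three subprocess hunks in this file apply the same pattern: pass `timeout=` to `subprocess.run` and treat `subprocess.TimeoutExpired` as a failed clip instead of letting a hung renderer block the run. A minimal sketch of that pattern folded into one helper follows. The name `run_with_timeout` and its `label` parameter are hypothetical and do not appear in the commit, which inlines the try/except at each call site.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout_s: float, label: str) -> bool:
    """Run cmd; return True on exit code 0, False on non-zero exit or timeout.

    Hypothetical helper for illustration only, not part of the commit.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired once
        # timeout_s seconds have elapsed.
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        print(f" {label} timed out ({timeout_s:.0f}s)")
        return False
    if result.returncode != 0:
        # Only the tail of stderr is printed, as the commit does for ffmpeg.
        print(f" {label} error: {result.stderr[-300:]}")
        return False
    return True
```

A call such as `run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5, "demo")` stands in for a hung ffmpeg process and returns `False` after roughly half a second.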
+2 -2
@@ -17,7 +17,7 @@ from pathlib import Path
 import numpy as np
 import soundfile as sf
-STEM_NAMES = ["host", "caller", "music", "sfx", "ads", "idents"]
+STEM_NAMES = ["host", "caller", "devon", "music", "sfx", "ads", "idents"]
 def load_stems(stems_dir: Path) -> tuple[dict[str, np.ndarray], int]:
@@ -69,7 +69,7 @@ def remove_gaps(stems: dict[str, np.ndarray], sr: int,
     # Detect gaps in everything except music (which always plays).
     # This catches TTS latency gaps while protecting ad breaks and SFX transitions.
-    content = stems["host"] + stems["caller"] + stems["sfx"] + stems["ads"] + stems["idents"]
+    content = stems["host"] + stems["caller"] + stems["devon"] + stems["sfx"] + stems["ads"] + stems["idents"]
     rms = compute_rms(content, window_samples)
     # Threshold: percentile-based to sit above the mic noise floor
+132 -44
@@ -19,7 +19,8 @@ import shutil
 import subprocess
 import sys
 import tempfile
-from datetime import datetime, timezone
+import time
+from datetime import datetime, timedelta, timezone
 from pathlib import Path
 import ssl
@@ -303,7 +304,7 @@ TRANSCRIPT:
 {timestamped_text}
 Generate a JSON response with:
-1. "title": A catchy episode title (include "Episode {episode_number}:" prefix)
+1. "title": An episode title with "Episode {episode_number}:" prefix. The title MUST reference something SPECIFIC from this episode a caller's name, their situation, a memorable quote, or a specific moment. Good titles sound like you're telling a friend what happened: "Episode 12: Gary's Goat Problem and the Worst Best Man Speech Ever", "Episode 8: The Lawnmower Feud, a Cursed Wedding Ring, and Darla Finally Calls Back". Bad titles are generic and could apply to any podcast episode: "Secrets and Confessions", "Late Night Tales", "Wild Stories and Hot Takes". Avoid the words: secrets, confessions, tales, chronicles, diaries, unfiltered, raw, real talk.
 2. "description": A 2-4 sentence description summarizing the episode's content. Mention callers by name and their topics. End with something engaging.
 3. "chapters": An array of chapter objects with "startTime" (in seconds) and "title". Include:
 - "Intro" at 0 seconds
@@ -1081,9 +1082,79 @@ def upload_image_to_postiz(image_path: str) -> dict | None:
     return None
-def post_to_social(metadata: dict, episode_slug: str, image_path: str = None):
+def _build_platform_content(metadata: dict, episode_url: str, yt_url: str | None,
+                            platform: str) -> str:
+    """Generate platform-tailored social post content for episode announcements."""
+    title = metadata["title"]
+    desc = metadata["description"]
+    if platform == "x":
+        hook = desc.split(". ")[0] + "."
+        content = f"{hook}\n\n{episode_url}\n\n#LukeAtTheRoost #podcast"
+        if len(content) > 280:
+            content = f"{title}\n\n{episode_url}"[:280]
+    elif platform == "instagram":
+        hashtags = ("#podcast #LukeAtTheRoost #talkradio #callinshow #newepisode "
+                    "#podcastlife #podcastrecommendations #comedy #advice "
+                    "#latenightradio #aipodcast #talkshow")
+        content = f"New episode 🎙️\n\n{desc}\n\nLink in bio.\n\n{hashtags}"
+    elif platform == "threads":
+        content = (f"{title}\n\n{desc}\n\nlukeattheroost.com"
+                   f"\n\n#podcast #LukeAtTheRoost #newepisode #callinshow")
+    elif platform == "bluesky":
+        content = f"{desc}\n\n{episode_url}"
+        if len(content) > 300:
+            avail = 300 - len(episode_url) - 2
+            content = desc[:avail].rsplit(" ", 1)[0] + "\n\n" + episode_url
+    elif platform == "mastodon":
+        content = f"{title}\n\n{desc}\n\n{episode_url}"
+        if yt_url:
+            content += f"\n{yt_url}"
+    elif platform == "linkedin":
+        content = f"{title}\n\n{desc}"
+        content += f"\n\nListen: {episode_url}"
+        if yt_url:
+            content += f"\nWatch: {yt_url}"
+    elif platform == "facebook":
+        content = f"New episode just dropped 🎙️\n\n{desc}\n\nListen free: {episode_url}"
+        if yt_url:
+            content += f"\nWatch: {yt_url}"
+    elif platform == "tiktok":
+        hook = desc.split(". ")[0] + "."
+        content = (f"New episode: {hook}"
+                   f"\n\n#podcast #LukeAtTheRoost #callinshow #newepisode #fyp")
+    elif platform == "nostr":
+        content = f"{title}\n\n{desc}\n\n{episode_url}"
+        if yt_url:
+            content += f"\n{yt_url}"
+    else:
+        content = f"{title}\n\n{desc}\n\n{episode_url}"
+    return content
+# Platforms that post immediately vs scheduled (minutes offset from publish time)
+_IMMEDIATE_PLATFORMS = {"x", "bluesky"}
+_SCHEDULE_OFFSETS = {
+    "instagram": 30, "threads": 30,
+    "facebook": 60, "linkedin": 60,
+    "tiktok": 90, "mastodon": 120, "nostr": 120,
+}
+def post_to_social(metadata: dict, episode_slug: str, image_path: str = None,
+                   yt_video_id: str = None):
     """Post episode announcement to all connected social channels via Postiz."""
-    print("[5.5/5] Posting to social media...")
+    print("[5.7] Posting to social media...")
     token = _get_postiz_token()
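The Bluesky branch of `_build_platform_content` trims the description at a word boundary so that the description, a blank line, and the episode URL fit in 300 characters. A minimal standalone sketch of that arithmetic follows. The function name `fit_with_url` and the `limit` parameter are hypothetical and are not part of the commit.

```python
def fit_with_url(desc: str, url: str, limit: int = 300) -> str:
    """Return desc, a blank line, then url, trimming desc so the result fits limit.

    Hypothetical helper mirroring the Bluesky branch; assumes the URL plus the
    two-character separator is shorter than limit.
    """
    content = f"{desc}\n\n{url}"
    if len(content) <= limit:
        return content
    # Characters left for the description after reserving room for the
    # "\n\n" separator (2 characters) and the URL itself.
    avail = limit - len(url) - 2
    # Cut at avail, then drop the trailing partial word.
    trimmed = desc[:avail].rsplit(" ", 1)[0]
    return f"{trimmed}\n\n{url}"
```

Because `rsplit(" ", 1)[0]` can only shorten the slice, the result never exceeds `limit` under the stated assumption. A description with no spaces in the first `avail` characters is cut mid-word.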
@@ -1095,31 +1166,17 @@ def post_to_social(metadata: dict, episode_slug: str, image_path: str = None):
         image_ids = [{"id": media["id"], "path": media.get("path", "")}]
     episode_url = f"https://lukeattheroost.com/episode.html?slug={episode_slug}"
-    base_content = f"{metadata['title']}\n\n{metadata['description']}\n\n{episode_url}"
-    hashtags = "#podcast #LukeAtTheRoost #talkradio #callinshow #newepisode"
-    hashtag_platforms = {"instagram", "facebook", "bluesky", "mastodon", "nostr", "linkedin", "threads", "tiktok"}
-    # Platform-specific content length limits
-    PLATFORM_MAX_LENGTH = {"bluesky": 300, "threads": 500, "tiktok": 2200}
+    yt_url = f"https://youtube.com/watch?v={yt_video_id}" if yt_video_id else None
+    now = datetime.now(timezone.utc)
     # Post to each platform individually so one failure doesn't block others
     posted = 0
     for platform, intg_config in POSTIZ_INTEGRATIONS.items():
-        content = base_content
-        if platform in hashtag_platforms:
-            content += f"\n\n{hashtags}"
-        # Truncate for platforms with short limits
-        max_len = PLATFORM_MAX_LENGTH.get(platform)
-        if max_len and len(content) > max_len:
-            # Keep title + URL, truncate description
-            short = f"{metadata['title']}\n\n{episode_url}"
-            if platform in hashtag_platforms:
-                short += f"\n\n{hashtags}"
-            content = short[:max_len]
-        settings = {"post_type": "post"}
+        content = _build_platform_content(metadata, episode_url, yt_url, platform)
+        settings = {"__type": platform, "post_type": "post"}
+        if platform == "x":
+            settings["who_can_reply_post"] = "everyone"
         if "channel" in intg_config:
             settings["channel"] = intg_config["channel"]
@@ -1129,30 +1186,46 @@ def post_to_social(metadata: dict, episode_slug: str, image_path: str = None):
"settings": settings, "settings": settings,
         }
+        # Stagger: immediate for fast-moving platforms, scheduled for rest
+        offset_min = _SCHEDULE_OFFSETS.get(platform, 0)
+        if platform in _IMMEDIATE_PLATFORMS or offset_min == 0:
+            post_type = "now"
+            post_date = now.strftime("%Y-%m-%dT%H:%M:%S.000Z")
+        else:
+            post_type = "schedule"
+            scheduled = now + timedelta(minutes=offset_min)
+            post_date = scheduled.strftime("%Y-%m-%dT%H:%M:%S.000Z")
         payload = {
-            "type": "now",
+            "type": post_type,
             "shortLink": False,
-            "date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z"),
+            "date": post_date,
             "tags": [],
             "posts": [post],
         }
-        try:
-            resp = requests.post(
-                f"{POSTIZ_URL}/api/posts",
-                headers={"auth": token, "Content-Type": "application/json"},
-                json=payload,
-                timeout=60,
-            )
-            if resp.status_code in (200, 201):
-                posted += 1
-                print(f" Posted to {platform}")
-            else:
-                print(f" Warning: {platform} failed ({resp.status_code}): {resp.text[:150]}")
-        except Exception as e:
-            print(f" Warning: {platform} failed: {e}")
-    print(f" Posted to {posted}/{len(POSTIZ_INTEGRATIONS)} channels")
+        # Retry once on failure (2 attempts, 5s backoff)
+        for attempt in range(2):
+            try:
+                resp = requests.post(
+                    f"{POSTIZ_URL}/api/posts",
+                    headers={"auth": token, "Content-Type": "application/json"},
+                    json=payload,
+                    timeout=60,
+                )
+                if resp.status_code in (200, 201):
+                    posted += 1
+                    label = f"scheduled +{offset_min}m" if post_type == "schedule" else "posted"
+                    print(f" {platform}: {label}")
+                    break
+                else:
+                    print(f" Warning: {platform} attempt {attempt + 1} failed ({resp.status_code}): {resp.text[:150]}")
+            except Exception as e:
+                print(f" Warning: {platform} attempt {attempt + 1} failed: {e}")
+            if attempt < 1:
+                time.sleep(5)
+    print(f" Posted/scheduled {posted}/{len(POSTIZ_INTEGRATIONS)} channels")
 def get_youtube_service():
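The staggering rule in this hunk can be read as a pure function of the platform name and the publish time: platforms in the immediate set, or with no configured offset, post now, and the rest are scheduled some minutes later. A minimal sketch of that decision, using the same offsets as the commit, follows. The function name `plan_post` and the constant `DATE_FMT` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the commit's _IMMEDIATE_PLATFORMS and _SCHEDULE_OFFSETS.
IMMEDIATE_PLATFORMS = {"x", "bluesky"}
SCHEDULE_OFFSETS = {
    "instagram": 30, "threads": 30,
    "facebook": 60, "linkedin": 60,
    "tiktok": 90, "mastodon": 120, "nostr": 120,
}
DATE_FMT = "%Y-%m-%dT%H:%M:%S.000Z"  # hypothetical name for the format string


def plan_post(platform: str, now: datetime) -> tuple[str, str]:
    """Return (post_type, post_date) for one platform.

    Hypothetical helper; the commit computes these inline in post_to_social.
    """
    offset_min = SCHEDULE_OFFSETS.get(platform, 0)
    if platform in IMMEDIATE_PLATFORMS or offset_min == 0:
        return "now", now.strftime(DATE_FMT)
    scheduled = now + timedelta(minutes=offset_min)
    return "schedule", scheduled.strftime(DATE_FMT)


if __name__ == "__main__":
    publish_time = datetime(2026, 3, 31, 7, 0, 0, tzinfo=timezone.utc)
    for name in ("x", "instagram", "tiktok", "unknown"):
        print(name, plan_post(name, publish_time))
```

Keeping the decision separate from the HTTP call makes the schedule testable without a Postiz instance. A platform missing from both tables falls through to an immediate post, which matches the `offset_min == 0` check in the hunk.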
@@ -1199,6 +1272,21 @@ def _check_youtube_duplicate(youtube, title: str) -> str | None:
     return None
+def _extract_youtube_tags(metadata: dict) -> list[str]:
+    """Extract dynamic tags from episode metadata for YouTube SEO."""
+    base_tags = ["podcast", "Luke at the Roost", "talk radio", "call-in show",
+                 "talk show", "comedy", "AI podcast", "late night radio", "advice"]
+    skip = {"intro", "outro", "opening", "closing", "wrap up", "wrap-up"}
+    dynamic = []
+    for ch in metadata.get("chapters", []):
+        title = ch.get("title", "").strip()
+        if title.lower() in skip or len(title) < 3:
+            continue
+        if len(title) <= 50:
+            dynamic.append(title)
+    return (base_tags + dynamic)[:25]
 def upload_to_youtube(audio_path: str, metadata: dict, chapters: list,
                       episode_slug: str) -> str | None:
     """Convert audio to video with cover art, upload to YouTube, add to podcast playlist."""
@@ -1257,8 +1345,7 @@ def upload_to_youtube(audio_path: str, metadata: dict, chapters: list,
"snippet": { "snippet": {
"title": metadata["title"][:100], "title": metadata["title"][:100],
"description": description, "description": description,
"tags": ["podcast", "Luke at the Roost", "talk radio", "call-in show", "tags": _extract_youtube_tags(metadata),
"talk show", "comedy"],
"categoryId": "22", "categoryId": "22",
}, },
"status": { "status": {
@@ -1629,7 +1716,8 @@ def main():
     else:
         social_image_path = str(audio_path.with_suffix(".social.jpg"))
         generate_social_image(episode_number, metadata["description"], social_image_path)
-        post_to_social(metadata, episode["slug"], social_image_path)
+        post_to_social(metadata, episode["slug"], social_image_path,
+                       yt_video_id=yt_video_id)
         _mark_step_done(episode_number, "social")
     # Step 6: Summary
+227 -70
@@ -9,15 +9,19 @@
 ---------------------------------------------------------------------------
 local SILENCE_DB = -30 -- dBFS — anything below this is "silence"
 local MIN_SILENCE_SEC = 6.0 -- same-speaker gaps: only remove silences longer than this
-local MIN_SILENCE_TRANSITION_SEC = 2.5 -- cross-speaker gaps: shorter threshold for speaker transitions
+local MAX_SILENCE_SEC = 999 -- no practical limit (IDENT/AD regions protect real breaks)
+local MIN_SILENCE_TRANSITION_SEC = 5.0 -- cross-speaker gaps: threshold for caller TTS latency
+local MIN_SILENCE_DEVON_SEC = 3.0 -- Devon gaps: interjections are prerendered (~2-3s gaps), conversational TTS is 6s+
+local DEVON_TRACK = 2 -- 1-indexed: Devon track number
 local MIN_VOICE_SEC = 0.3 -- ignore non-silent bursts shorter than this (filters transients)
 local KEEP_PAD_SEC = 0.5 -- leave this much silence on each side of a cut
 local BLOCK_SEC = 0.1 -- analysis block size (100ms)
 local SAMPLE_RATE = 48000
-local CHECK_TRACKS = {1, 2, 3} -- 1-indexed: Host, Live Caller, AI Caller
-local IDENTS_TRACK = 5 -- 1-indexed: Idents track
-local ADS_TRACK = 6 -- 1-indexed: Ads track
-local MUSIC_TRACK = 7 -- 1-indexed: Music track
+local CHECK_TRACKS = {1, 2, 3, 4} -- 1-indexed: Host, Devon, AI Caller, Live Caller
+local SFX_TRACK = 5 -- 1-indexed: SFX track
+local IDENTS_TRACK = 6 -- 1-indexed: Idents track
+local ADS_TRACK = 7 -- 1-indexed: Ads track
+local MUSIC_TRACK = 8 -- 1-indexed: Music track
 local MUSIC_FADE_SEC = 2.0 -- fade duration for music in/out around ads/idents
 local YIELD_INTERVAL = 200 -- yield to REAPER every N blocks (~20s of audio)
 ---------------------------------------------------------------------------
@@ -25,7 +29,6 @@ local YIELD_INTERVAL = 200 -- yield to REAPER every N blocks (~20s of audio)
local BLOCK_SAMPLES = math.floor(SAMPLE_RATE * BLOCK_SEC) local BLOCK_SAMPLES = math.floor(SAMPLE_RATE * BLOCK_SEC)
local THRESHOLD = 10 ^ (SILENCE_DB / 20) local THRESHOLD = 10 ^ (SILENCE_DB / 20)
local MIN_VOICE_BLOCKS = math.ceil(MIN_VOICE_SEC / BLOCK_SEC) local MIN_VOICE_BLOCKS = math.ceil(MIN_VOICE_SEC / BLOCK_SEC)
local function log(msg) local function log(msg)
reaper.ShowConsoleMsg("[PostProd] " .. msg .. "\n") reaper.ShowConsoleMsg("[PostProd] " .. msg .. "\n")
end end
@@ -306,13 +309,17 @@ local function read_block_peak_rms(ta, project_time)
 end
 -- find_loudest_track: returns 1-based index of the loudest track at a given time, or 0 if silent
+-- Uses RMS (not peak) for speaker identification — ambient mic noise has high peaks but low RMS
 local function find_loudest_track(track_audios, project_time)
     local best_peak = 0
+    local best_rms = 0
     local best_idx = 0
     for i, ta in ipairs(track_audios) do
-        local peak, _ = read_block_peak_rms(ta, project_time)
-        if peak > best_peak then
-            best_peak = peak
+        local peak, sum_sq = read_block_peak_rms(ta, project_time)
+        if peak > best_peak then best_peak = peak end
+        local rms = math.sqrt(sum_sq / BLOCK_SAMPLES)
+        if rms > best_rms then
+            best_rms = rms
             best_idx = i
         end
     end
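The switch from peak to RMS in `find_loudest_track` rests on the difference between a transient and sustained energy: a single click can have the highest peak in a block while contributing almost nothing to its RMS. A minimal sketch of that comparison follows, written in Python over plain lists of samples. The names `peak_and_rms` and `loudest_by_rms` are hypothetical, and the Lua script reads its blocks from WAV sources, which is not shown here.

```python
import math


def peak_and_rms(block: list[float]) -> tuple[float, float]:
    """Return (peak, rms) for one block of samples in the range -1.0..1.0."""
    peak = max(abs(s) for s in block)
    rms = math.sqrt(sum(s * s for s in block) / len(block))
    return peak, rms


def loudest_by_rms(blocks: list[list[float]]) -> int:
    """Return the 1-based index of the block with the highest RMS, or 0 if all are silent."""
    best_rms = 0.0
    best_idx = 0
    for i, block in enumerate(blocks, start=1):
        _, rms = peak_and_rms(block)
        if rms > best_rms:
            best_rms = rms
            best_idx = i
    return best_idx


if __name__ == "__main__":
    # 100 ms at 48 kHz: one loud click versus a quieter sustained 200 Hz tone.
    click = [0.0] * 4799 + [0.9]
    tone = [0.3 * math.sin(2 * math.pi * 200 * n / 48000) for n in range(4800)]
    print(peak_and_rms(click), peak_and_rms(tone))
    print(loudest_by_rms([click, tone]))
```

In this example the click has the larger peak but the tone has the larger RMS, so an RMS comparison attributes the block to the tone. That is the behaviour the commit's comment describes for host mic ambient noise versus TTS speech.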
@@ -340,12 +347,17 @@ local function find_silences(region, track_audios, rms_acc, progress_fn)
     while t < region.end_pos do
         local best_peak = 0
+        local best_rms = 0
         local best_sum = 0
         local best_track = 0
         for i, ta in ipairs(track_audios) do
             local peak, sum_sq = read_block_peak_rms(ta, t)
-            if peak > best_peak then
-                best_peak = peak
+            if peak > best_peak then best_peak = peak end
+            -- Use RMS for speaker identification (sustained energy, not transient peaks)
+            -- Host mic ambient noise has high peaks but low RMS; TTS speech has high RMS
+            local rms = math.sqrt(sum_sq / BLOCK_SAMPLES)
+            if rms > best_rms then
+                best_rms = rms
                 best_sum = sum_sq
                 best_track = i
             end
@@ -375,8 +387,11 @@ local function find_silences(region, track_audios, rms_acc, progress_fn)
                 local dur = voice_start - silence_start
                 local track_after = voice_run_track
                 local is_transition = track_before_silence ~= 0 and track_after ~= 0 and track_before_silence ~= track_after
-                local threshold = is_transition and MIN_SILENCE_TRANSITION_SEC or MIN_SILENCE_SEC
-                if dur >= threshold then
+                local devon_involved = track_before_silence == DEVON_TRACK or track_after == DEVON_TRACK
+                local threshold = devon_involved and MIN_SILENCE_DEVON_SEC
+                    or (is_transition and MIN_SILENCE_TRANSITION_SEC or MIN_SILENCE_SEC)
+                if dur >= threshold and dur <= MAX_SILENCE_SEC then
                     table.insert(silences, {
                         start_pos = silence_start, end_pos = voice_start, duration = dur,
                         is_transition = is_transition,
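The threshold selection in this hunk has three tiers: gaps touching the Devon track use the shortest threshold, other cross-speaker gaps use the transition threshold, and same-speaker gaps use the default. A minimal Python sketch of that rule follows, with constants copied from the script's configuration block. The function names `silence_threshold` and `should_remove` are hypothetical.

```python
# Values copied from the Lua configuration block in this commit.
MIN_SILENCE_SEC = 6.0
MIN_SILENCE_TRANSITION_SEC = 5.0
MIN_SILENCE_DEVON_SEC = 3.0
MAX_SILENCE_SEC = 999
DEVON_TRACK = 2


def silence_threshold(track_before: int, track_after: int) -> float:
    """Return the minimum gap length (seconds) that qualifies for removal.

    Track numbers are 1-based; 0 means no speaker was identified.
    """
    if DEVON_TRACK in (track_before, track_after):
        return MIN_SILENCE_DEVON_SEC
    is_transition = (
        track_before != 0 and track_after != 0 and track_before != track_after
    )
    return MIN_SILENCE_TRANSITION_SEC if is_transition else MIN_SILENCE_SEC


def should_remove(dur: float, track_before: int, track_after: int) -> bool:
    """True when a gap of dur seconds falls inside the removable range."""
    return silence_threshold(track_before, track_after) <= dur <= MAX_SILENCE_SEC
```

With these values a 4-second gap between the host (track 1) and Devon (track 2) is removed, while the same gap between the host and an AI caller (track 3) is kept.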
@@ -410,7 +425,7 @@ local function find_silences(region, track_audios, rms_acc, progress_fn)
     if in_silence then
         local dur = region.end_pos - silence_start
-        if dur >= MIN_SILENCE_SEC then
+        if dur >= MIN_SILENCE_SEC and dur <= MAX_SILENCE_SEC then
             table.insert(silences, {start_pos = silence_start, end_pos = region.end_pos, duration = dur})
         end
     end
@@ -452,7 +467,10 @@ local function phase1_strip_silence(dialog_regions)
     for _, r in ipairs(get_regions_by_type("^IDENT%s+%d+$")) do table.insert(protected_regions, r) end
     table.sort(protected_regions, function(a, b) return a.start_pos < b.start_pos end)
     if #protected_regions > 0 then
-        log(" Protecting " .. #protected_regions .. " AD/IDENT region(s) from silence removal")
+        log(" Protecting " .. #protected_regions .. " AD/IDENT region(s) from silence removal:")
+        for _, pr in ipairs(protected_regions) do
+            log(" " .. pr.name .. " at " .. string.format("%.1f", pr.start_pos) .. "-" .. string.format("%.1f", pr.end_pos) .. "s")
+        end
     end
     log("Phase 1: Analyzing using " .. tracks_loaded .. "/" .. #CHECK_TRACKS .. " voice tracks")
@@ -498,6 +516,11 @@ local function phase1_strip_silence(dialog_regions)
                     break
                 end
             end
+            -- Preserve the very first silence (music intro before host starts talking)
+            if not protected and ri == 1 and #removals == 0 and s.start_pos <= rgn.start_pos + 1.0 then
+                protected = true
+                log(" KEEP " .. string.format("%.1f", rm_end - rm_start) .. "s at " .. string.format("%.1f", s.start_pos) .. "-" .. string.format("%.1f", s.end_pos) .. " (music intro)")
+            end
             if not protected then
                 table.insert(removals, {start_pos = rm_start, end_pos = rm_end})
                 local tag = s.is_transition and " [transition]" or ""
@@ -588,64 +611,88 @@ end
-- Phase 2: Normalize AD/IDENT volume to match dialog -- Phase 2: Normalize AD/IDENT volume to match dialog
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
-local function normalize_track_regions(track_idx, regions, target_db)
+local function normalize_track_items(track_idx, target_db, label)
+    -- Normalize all items on a track that have audible content.
+    -- Uses direct WAV reading (not audio accessor) so it works after Phase 1 splits.
     local track = reaper.GetTrack(0, track_idx - 1)
-    if not track or reaper.CountTrackMediaItems(track) == 0 then return end
+    if not track then
+        log(" " .. label .. ": track " .. track_idx .. " does not exist")
+        return
+    end
-    for _, rgn in ipairs(regions) do
-        local item = find_item_at(track, rgn.start_pos)
-        if not item then goto next_region end
-        local item_start = reaper.GetMediaItemInfo_Value(item, "D_POSITION")
-        local segment = item
-        if item_start < rgn.start_pos - 0.01 then
-            segment = reaper.SplitMediaItem(item, rgn.start_pos)
-            if not segment then goto next_region end
-        end
-        local seg_end = reaper.GetMediaItemInfo_Value(segment, "D_POSITION")
-            + reaper.GetMediaItemInfo_Value(segment, "D_LENGTH")
-        if rgn.end_pos < seg_end - 0.01 then
-            reaper.SplitMediaItem(segment, rgn.end_pos)
-        end
-        local take = reaper.GetActiveTake(segment)
-        if not take then goto next_region end
-        local seg_pos = reaper.GetMediaItemInfo_Value(segment, "D_POSITION")
-        local seg_len = reaper.GetMediaItemInfo_Value(segment, "D_LENGTH")
-        local seg_offset = reaper.GetMediaItemTakeInfo_Value(take, "D_STARTOFFS")
-        local accessor = reaper.CreateTakeAudioAccessor(take)
+    local item_count = reaper.CountTrackMediaItems(track)
+    log(" " .. label .. ": " .. item_count .. " item(s) on track " .. track_idx)
+    if item_count == 0 then return end
+    local ta = get_track_audio(track_idx)
+    if not ta then
+        log(" " .. label .. ": get_track_audio() returned nil — no readable WAV sources")
+        return
+    end
+    log(" " .. label .. ": " .. #ta.segments .. " WAV segment(s), span " .. string.format("%.1f", ta.item_pos) .. "-" .. string.format("%.1f", ta.item_end) .. "s")
+    local adjusted = 0
+    local skipped_silent = 0
+    local skipped_small = 0
+    for i = 0, item_count - 1 do
+        local item = reaper.GetTrackMediaItem(track, i)
+        local item_pos = reaper.GetMediaItemInfo_Value(item, "D_POSITION")
+        local item_len = reaper.GetMediaItemInfo_Value(item, "D_LENGTH")
+        local item_end = item_pos + item_len
+        -- Measure RMS of audible content in this item
         local sum_sq = 0
         local count = 0
-        local t = seg_pos
-        while t < seg_pos + seg_len do
-            local source_time = t - seg_pos + seg_offset
-            local buf = reaper.new_array(BLOCK_SAMPLES)
-            reaper.GetAudioAccessorSamples(accessor, SAMPLE_RATE, 1, source_time, BLOCK_SAMPLES, buf)
-            for i = 1, BLOCK_SAMPLES do
-                sum_sq = sum_sq + buf[i] * buf[i]
+        local total_blocks = 0
+        local t = item_pos
+        while t < item_end do
+            local peak, s_sq = read_block_peak_rms(ta, t)
+            total_blocks = total_blocks + 1
+            if peak >= THRESHOLD then
+                sum_sq = sum_sq + s_sq
+                count = count + BLOCK_SAMPLES
             end
-            count = count + BLOCK_SAMPLES
             t = t + BLOCK_SEC
         end
-        reaper.DestroyAudioAccessor(accessor)
+        local audible_blocks = math.floor(count / BLOCK_SAMPLES)
         if count > 0 then
             local item_rms = math.sqrt(sum_sq / count)
             if item_rms > 0 then
                 local item_db = 20 * math.log(item_rms, 10)
                 local gain_db = target_db - item_db
-                local gain_linear = 10 ^ (gain_db / 20)
-                local current_vol = reaper.GetMediaItemInfo_Value(segment, "D_VOL")
-                reaper.SetMediaItemInfo_Value(segment, "D_VOL", current_vol * gain_linear)
-                log(" " .. rgn.name .. ": " .. string.format("%+.1f", gain_db) .. "dB adjustment")
+                local current_vol = reaper.GetMediaItemInfo_Value(item, "D_VOL")
+                log(" " .. label .. " item " .. (i+1) .. "/" .. item_count
+                    .. " pos=" .. string.format("%.1f", item_pos) .. "s"
+                    .. " len=" .. string.format("%.1f", item_len) .. "s"
+                    .. " blocks=" .. total_blocks .. "/" .. audible_blocks .. " audible"
+                    .. " RMS=" .. string.format("%.1f", item_db) .. "dB"
+                    .. " target=" .. string.format("%.1f", target_db) .. "dB"
+                    .. " gain=" .. string.format("%+.1f", gain_db) .. "dB"
+                    .. " vol=" .. string.format("%.3f", current_vol))
+                -- Only adjust if the difference is significant (> 1dB)
+                if math.abs(gain_db) > 1.0 then
+                    local gain_linear = 10 ^ (gain_db / 20)
+                    reaper.SetMediaItemInfo_Value(item, "D_VOL", current_vol * gain_linear)
+                    log(" -> APPLIED: vol " .. string.format("%.3f", current_vol) .. " -> " .. string.format("%.3f", current_vol * gain_linear))
+                    adjusted = adjusted + 1
+                else
+                    log(" -> SKIPPED: gain within 1dB threshold")
+                    skipped_small = skipped_small + 1
+                end
             end
+        else
+            log(" " .. label .. " item " .. (i+1) .. "/" .. item_count
+                .. " pos=" .. string.format("%.1f", item_pos) .. "s"
+                .. " len=" .. string.format("%.1f", item_len) .. "s"
+                .. " blocks=" .. total_blocks
+                .. " — NO AUDIBLE BLOCKS (all below " .. SILENCE_DB .. "dB)")
+            skipped_silent = skipped_silent + 1
         end
-        ::next_region::
     end
+    destroy_track_audio(ta)
+    log(" " .. label .. " RESULT: " .. adjusted .. " adjusted, " .. skipped_small .. " within 1dB, " .. skipped_silent .. " silent")
 end
 local function normalize_music_track(dialog_regions, target_db)
@@ -728,27 +775,35 @@ local function phase2_normalize(dialog_regions, ad_regions, ident_regions, dialo
     end
     log("Phase 2: Dialog RMS = " .. string.format("%.1f", dialog_rms_db) .. " dBFS")
-    local dialog_db = dialog_rms_db
-    if #ad_regions > 0 then
-        progress_detail = "Ads"
-        coroutine.yield()
-        log("Phase 2: Normalizing " .. #ad_regions .. " AD region(s)...")
-        normalize_track_regions(ADS_TRACK, ad_regions, dialog_db)
-    end
-    if #ident_regions > 0 then
-        progress_detail = "Idents"
-        progress_pct = 0.33
-        coroutine.yield()
-        log("Phase 2: Normalizing " .. #ident_regions .. " IDENT region(s)...")
-        normalize_track_regions(IDENTS_TRACK, ident_regions, dialog_db)
-    end
+    -- Ads/idents are pre-compressed dense audio, so they sound louder than dialog
+    -- at the same RMS. Target a few dB below dialog to match perceived loudness.
+    local AD_IDENT_OFFSET_DB = -4
+    local ad_ident_target = dialog_rms_db + AD_IDENT_OFFSET_DB
+    log("Phase 2: AD/IDENT target = " .. string.format("%.1f", ad_ident_target) .. " dBFS (" .. AD_IDENT_OFFSET_DB .. "dB offset from dialog)")
+    progress_detail = "Ads"
+    coroutine.yield()
+    log("Phase 2: Normalizing ads track...")
+    normalize_track_items(ADS_TRACK, ad_ident_target, "Ads")
+    progress_detail = "Idents"
+    progress_pct = 0.25
+    coroutine.yield()
+    log("Phase 2: Normalizing idents track...")
+    normalize_track_items(IDENTS_TRACK, ad_ident_target, "Idents")
+    progress_detail = "SFX"
+    progress_pct = 0.50
+    coroutine.yield()
+    log("Phase 2: Normalizing SFX track...")
+    normalize_track_items(SFX_TRACK, ad_ident_target, "SFX")
     progress_detail = "Music"
-    progress_pct = 0.66
+    progress_pct = 0.75
     coroutine.yield()
     log("Phase 2: Normalizing music track...")
-    normalize_music_track(dialog_regions, dialog_rms_db)
+    normalize_music_track(dialog_regions, dialog_rms_db)
     progress_pct = 1.0
 end
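As a worked example of the offset target introduced in this hunk: the ads, idents, and SFX tracks are all normalized to the dialog RMS plus `AD_IDENT_OFFSET_DB`. The sketch below computes the gain each track would need. The measured RMS values and the `track_gains` helper are made up for illustration; only the -4 dB constant comes from the diff:

```python
AD_IDENT_OFFSET_DB = -4  # constant taken from the diff


def track_gains(dialog_rms_db: float, track_rms_db: dict[str, float],
                offset_db: float = AD_IDENT_OFFSET_DB) -> dict[str, float]:
    """Return {track: gain_db} needed to bring each track to dialog + offset."""
    target = dialog_rms_db + offset_db
    return {name: target - rms for name, rms in track_rms_db.items()}


# Hypothetical measurements: dialog at -20 dBFS gives a -24 dBFS target.
gains = track_gains(-20.0, {"Ads": -16.0, "Idents": -24.0, "SFX": -30.0})
```

With these numbers the ads are turned down, the idents are left where they are, and the SFX are turned up.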
@@ -766,6 +821,75 @@ local function phase3_trim_music()
     local music_track = reaper.GetTrack(0, MUSIC_TRACK - 1)
     if not music_track then return end
+    -- Music lead-in: ensure audible music plays before first voice.
+    -- Strategy: skip the silent intro in the music WAV (adjust take offset),
+    -- then nudge all non-music tracks forward by MUSIC_LEAD_SEC so music plays first.
+    local MUSIC_LEAD_SEC = 3.0
+    -- Find where music becomes audible in the source WAV
+    local music_audible_offset = nil
+    local music_ta = get_track_audio(MUSIC_TRACK)
+    if music_ta then
+        local t = music_ta.item_pos
+        while t < music_ta.item_end do
+            local peak, _ = read_block_peak_rms(music_ta, t)
+            if peak >= THRESHOLD then
+                music_audible_offset = t - music_ta.item_pos -- offset into the WAV
+                break
+            end
+            t = t + BLOCK_SEC
+        end
+        destroy_track_audio(music_ta)
+    end
+    if false then -- Music lead-in disabled — intro silence is preserved instead
+        -- Skip the silent intro: set take offset so audible music starts at position 0
+        local first_music = reaper.GetTrackMediaItem(music_track, 0)
+        if first_music then
+            local take = reaper.GetActiveTake(first_music)
+            if take then
+                local current_offset = reaper.GetMediaItemTakeInfo_Value(take, "D_STARTOFFS")
+                reaper.SetMediaItemTakeInfo_Value(take, "D_STARTOFFS", current_offset + music_audible_offset)
+                -- Trim item length to account for skipped intro
+                local item_len = reaper.GetMediaItemInfo_Value(first_music, "D_LENGTH")
+                reaper.SetMediaItemInfo_Value(first_music, "D_LENGTH", item_len - music_audible_offset)
+                log("Phase 3: Skipped " .. string.format("%.1f", music_audible_offset) .. "s of silent music intro")
+            end
+        end
+        -- Nudge all non-music tracks forward by MUSIC_LEAD_SEC
+        log("Phase 3: Nudging non-music tracks forward by " .. MUSIC_LEAD_SEC .. "s for music lead-in")
+        for t = 0, reaper.CountTracks(0) - 1 do
+            if (t + 1) == MUSIC_TRACK then goto skip_music end
+            local track = reaper.GetTrack(0, t)
+            for i = 0, reaper.CountTrackMediaItems(track) - 1 do
+                local item = reaper.GetTrackMediaItem(track, i)
+                local pos = reaper.GetMediaItemInfo_Value(item, "D_POSITION")
+                reaper.SetMediaItemInfo_Value(item, "D_POSITION", pos + MUSIC_LEAD_SEC)
+            end
+            ::skip_music::
+        end
+        -- Shift markers/regions forward too
+        local markers_to_update = {}
+        local _, num_markers, num_regions = reaper.CountProjectMarkers(0)
+        for i = 0, num_markers + num_regions - 1 do
+            local retval, is_region, pos, rgnend, name, idx, color = reaper.EnumProjectMarkers3(0, i)
+            if retval then
+                table.insert(markers_to_update, {is_region=is_region, pos=pos, rgnend=rgnend, name=name, idx=idx, color=color})
+            end
+        end
+        for _, m in ipairs(markers_to_update) do
+            if m.is_region then
+                reaper.SetProjectMarker3(0, m.idx, true, m.pos + MUSIC_LEAD_SEC, m.rgnend + MUSIC_LEAD_SEC, m.name, m.color)
+            else
+                reaper.SetProjectMarker3(0, m.idx, false, m.pos + MUSIC_LEAD_SEC, 0, m.name, m.color)
+            end
+        end
+    else
+        log("Phase 3: No silent music intro detected — skipping lead-in adjustment")
+    end
     local last_end = 0
     for _, tidx in ipairs(CHECK_TRACKS) do
         local tr = reaper.GetTrack(0, tidx - 1)
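The audible-offset search in this hunk walks the music item in `BLOCK_SEC` steps and stops at the first block whose peak reaches `THRESHOLD`. A minimal Python sketch of that scan, assuming the per-block peaks have already been read into a list (the Lua code reads them one at a time through `read_block_peak_rms`). The function name and the sample values are hypothetical:

```python
from typing import Optional, Sequence


def first_audible_offset(block_peaks: Sequence[float], threshold: float,
                         block_sec: float) -> Optional[float]:
    """Return the offset in seconds of the first block at or above threshold.

    Returns None when every block is below the threshold, which corresponds
    to music_audible_offset staying nil in the Lua code.
    """
    for index, peak in enumerate(block_peaks):
        if peak >= threshold:
            return index * block_sec
    return None


# Hypothetical peaks: two quiet blocks, then audible music.
offset = first_audible_offset([0.0, 0.001, 0.2, 0.5], threshold=0.01, block_sec=0.5)
```

Because the result can be None (nil in Lua), any code that uses the offset for arithmetic, such as the disabled take-offset branch, would need to check for that case first.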
@@ -912,6 +1036,39 @@ local function do_work()
         log("Phase 4: No AD/IDENT regions found — skipping")
     end
+    -- Set loop/time selection: start 0.5s before audible music, end at last item
+    local loop_start = 0
+    local music_ta = get_track_audio(MUSIC_TRACK)
+    if music_ta then
+        local t = music_ta.item_pos
+        while t < music_ta.item_end do
+            local peak, _ = read_block_peak_rms(music_ta, t)
+            if peak >= THRESHOLD then
+                loop_start = math.max(0, t - 0.5)
+                break
+            end
+            t = t + BLOCK_SEC
+        end
+        destroy_track_audio(music_ta)
+    end
+    local project_end = 0
+    for t = 0, reaper.CountTracks(0) - 1 do
+        local track = reaper.GetTrack(0, t)
+        local n = reaper.CountTrackMediaItems(track)
+        if n > 0 then
+            local last_item = reaper.GetTrackMediaItem(track, n - 1)
+            local item_end = reaper.GetMediaItemInfo_Value(last_item, "D_POSITION")
+                + reaper.GetMediaItemInfo_Value(last_item, "D_LENGTH")
+            if item_end > project_end then project_end = item_end end
+        end
+    end
+    if project_end > 0 then
+        reaper.GetSet_LoopTimeRange(true, true, loop_start, project_end, false)
+        reaper.GetSet_LoopTimeRange(true, false, loop_start, project_end, false)
+        log("Loop range set: " .. string.format("%.1f", loop_start) .. " to " .. string.format("%.1f", project_end) .. "s (" .. string.format("%.1f", (project_end - loop_start) / 60) .. " min)")
+    end
     reaper.PreventUIRefresh(-1)
     reaper.Undo_EndBlock("Post-production: strip silence + music fades", -1)
     reaper.UpdateArrange()
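The loop-range logic added to `do_work` can be sketched in Python as follows: start half a second before the first audible music (clamped at zero) and end at the latest item end across all tracks. The data layout here, a list of tracks each holding `(position, length)` pairs, is an assumption made for illustration. One deliberate difference: the Lua code inspects only the last item on each track, whereas this sketch takes the maximum over every item, which gives the same answer whenever the last item on a track is also the one that ends latest:

```python
from typing import Optional, Sequence, Tuple

Item = Tuple[float, float]  # (position_sec, length_sec), hypothetical layout


def loop_range(first_audible: Optional[float],
               tracks: Sequence[Sequence[Item]],
               pre_roll: float = 0.5) -> Optional[Tuple[float, float]]:
    """Return (loop_start, project_end), or None when the project is empty."""
    loop_start = 0.0
    if first_audible is not None:
        loop_start = max(0.0, first_audible - pre_roll)
    project_end = max(
        (pos + length for items in tracks for pos, length in items),
        default=0.0,
    )
    if project_end <= 0:
        return None
    return loop_start, project_end
```

For example, with music first audible at 2.0 s and the latest item ending at 31.5 s, the range is (1.5, 31.5).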
+122
@@ -0,0 +1,122 @@
"""Scan music directory for tracks that contain vocals/lyrics.

Uses Whisper to transcribe a sample from each track; if it picks up
actual words, the track likely has vocals.

Usage:
    python scan_music_vocals.py           # scan and report
    python scan_music_vocals.py --delete  # scan and delete vocal tracks
"""
import argparse
import sys
from pathlib import Path

import librosa
import numpy as np
from faster_whisper import WhisperModel

MUSIC_DIR = Path(__file__).parent / "music"
WHISPER_MODEL = "distil-large-v3"

# Words Whisper hallucinates on silence/instrumental — ignore these
HALLUCINATION_PHRASES = {
    "thank you", "thanks for watching", "subscribe", "like and subscribe",
    "please subscribe", "thank you for watching", "thanks for listening",
    "you", "the end", "bye", "okay",
}


def scan_track(model: WhisperModel, filepath: Path) -> tuple[bool, str]:
    """Check a single track for vocals. Returns (has_vocals, transcription)."""
    try:
        audio, sr = librosa.load(str(filepath), sr=16000, mono=True)
    except Exception as e:
        return False, f"[load error: {e}]"

    duration = len(audio) / sr
    if duration < 10:
        return False, "[too short]"

    # Sample 30s from the middle (most likely to have vocals)
    mid = len(audio) // 2
    half_window = int(15 * sr)  # 15s each side
    start = max(0, mid - half_window)
    end = min(len(audio), mid + half_window)
    sample = audio[start:end]

    segments, info = model.transcribe(
        sample,
        beam_size=3,
        language="en",
        vad_filter=True,
        vad_parameters=dict(min_speech_duration_ms=500),
    )
    segments_list = list(segments)
    text = " ".join(s.text for s in segments_list).strip()

    # Filter out Whisper hallucinations
    text_lower = text.lower().strip()
    if text_lower in HALLUCINATION_PHRASES or len(text_lower) < 4:
        return False, ""

    # If Whisper found substantial text, it's likely vocals
    word_count = len(text.split())
    has_vocals = word_count >= 3
    return has_vocals, text


def main():
    parser = argparse.ArgumentParser(description="Scan music for vocal tracks")
    parser.add_argument("--delete", action="store_true", help="Delete tracks with vocals")
    args = parser.parse_args()

    audio_files = sorted(
        f for f in MUSIC_DIR.iterdir()
        if f.suffix.lower() in {".mp3", ".wav", ".ogg", ".flac"}
    )
    if not audio_files:
        print("No audio files found in music/")
        return

    print(f"Loading Whisper {WHISPER_MODEL}...")
    model = WhisperModel(WHISPER_MODEL, device="cpu", compute_type="int8")

    print(f"Scanning {len(audio_files)} tracks for vocals...\n")
    vocal_tracks = []
    for i, f in enumerate(audio_files, 1):
        print(f"[{i}/{len(audio_files)}] {f.name}...", end=" ", flush=True)
        has_vocals, text = scan_track(model, f)
        if has_vocals:
            print(f"VOCALS: {text[:80]}")
            vocal_tracks.append((f, text))
        else:
            print("OK")

    print(f"\n{'='*60}")
    print(f"Results: {len(vocal_tracks)} tracks with vocals out of {len(audio_files)}\n")

    if not vocal_tracks:
        print("All tracks appear to be instrumental!")
        return

    for f, text in vocal_tracks:
        print(f"  {f.name}")
        print(f"    Lyrics: {text[:120]}")
        print()

    if args.delete:
        print(f"Deleting {len(vocal_tracks)} vocal tracks...")
        for f, _ in vocal_tracks:
            f.unlink()
            print(f"  Deleted: {f.name}")
        print("Done.")
    else:
        print("Run with --delete to remove these tracks.")


if __name__ == "__main__":
    main()
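The hallucination filter in `scan_track` compares the whole lowercased transcription against `HALLUCINATION_PHRASES`. Whisper transcriptions often come back with punctuation, so a phrase such as "Thanks for watching." would not match the set as written and, at three words, would be counted as vocals. One possible variant, sketched under that assumption, strips punctuation before comparing. The helper names `normalize` and `looks_like_vocals` are illustrative and not part of the commit, and the phrase set is copied from the script:

```python
import string

HALLUCINATION_PHRASES = {
    "thank you", "thanks for watching", "subscribe", "like and subscribe",
    "please subscribe", "thank you for watching", "thanks for listening",
    "you", "the end", "bye", "okay",
}

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TABLE).split())


def looks_like_vocals(text: str, min_words: int = 3) -> bool:
    """Apply the script's rules (phrase filter, length, word count) to normalized text."""
    cleaned = normalize(text)
    if cleaned in HALLUCINATION_PHRASES or len(cleaned) < 4:
        return False
    return len(cleaned.split()) >= min_words
```

The thresholds (fewer than 4 characters, at least 3 words) match the ones in `scan_track`; only the normalization step is new.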
+15
@@ -23,6 +23,7 @@ load_dotenv(Path(__file__).parent / ".env")
 POSTIZ_API_KEY = os.getenv("POSTIZ_API_KEY")
 POSTIZ_URL = os.getenv("POSTIZ_URL", "https://social.lukeattheroost.com")
+POSTIZ_INTEGRATIONS = json.loads(os.getenv("POSTIZ_INTEGRATIONS", "{}"))
 BSKY_HANDLE = os.getenv("BSKY_HANDLE", "lukeattheroost.bsky.social")
 BSKY_APP_PASSWORD = os.getenv("BSKY_APP_PASSWORD")
@@ -95,8 +96,22 @@ def fetch_integrations() -> list[dict]:
     return resp.json()
+BLOCKED_INTEGRATION_IDS = {
+    "cmluam50j0001o46xifujx059",  # Personal LinkedIn (CareerPulse) — never post podcast content here
+}
 def find_integration(integrations: list[dict], provider: str) -> dict | None:
+    # Prefer hardcoded integration ID from .env (avoids picking wrong account)
+    if provider in POSTIZ_INTEGRATIONS:
+        target_id = POSTIZ_INTEGRATIONS[provider].get("id")
+        if target_id:
+            for integ in integrations:
+                if integ.get("id") == target_id:
+                    return integ
+    # Fallback: first matching provider (skip blocked accounts)
     for integ in integrations:
+        if integ.get("id") in BLOCKED_INTEGRATION_IDS:
+            continue
         if integ.get("identifier", "").startswith(provider) and not integ.get("disabled"):
             return integ
     return None
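To see the lookup order in this hunk at work, here is a self-contained sketch of the same two-step selection: a pinned ID wins when it is present in the fetched list, otherwise the first enabled, non-blocked integration whose identifier starts with the provider name is used. The pinned mapping and block list are passed as parameters here so the example runs without a `.env` file, and all IDs and identifiers in the sample data are invented:

```python
from typing import Optional


def find_integration(integrations: list[dict], provider: str,
                     pinned: dict, blocked: set) -> Optional[dict]:
    """Pick an integration: pinned ID first, then first enabled non-blocked match."""
    target_id = pinned.get(provider, {}).get("id")
    if target_id:
        for integ in integrations:
            if integ.get("id") == target_id:
                return integ
    for integ in integrations:
        if integ.get("id") in blocked:
            continue
        if integ.get("identifier", "").startswith(provider) and not integ.get("disabled"):
            return integ
    return None


# Hypothetical data standing in for the API response.
sample = [
    {"id": "personal-1", "identifier": "linkedin"},
    {"id": "show-1", "identifier": "linkedin-page"},
    {"id": "bsky-1", "identifier": "bluesky", "disabled": True},
]
fallback = find_integration(sample, "linkedin", pinned={}, blocked={"personal-1"})
pinned_hit = find_integration(sample, "linkedin",
                              pinned={"linkedin": {"id": "show-1"}}, blocked=set())
```

As in the diff, the pinned path does not consult the block list or the `disabled` flag, so a pinned ID is trusted as given.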