Upgrade Whisper to distil-large-v3, fix caller identity confusion, sort clips list

- Whisper base → distil-large-v3 for much better live transcription accuracy
- Add context hints to transcription (caller name, screening status)
- Increase beam_size 3→5 for better decoding
- Add explicit role clarification in caller system prompt so LLM knows Luke is the host
- Prefix host messages with [Host Luke] in LLM conversation
- Fix upload_clips episode list sorting (natural numeric order)
- Episodes 26-28 transcripts, data updates, misc fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:46:51 -07:00
parent 6eeab58464
commit 0bdac16250
15 changed files with 1410 additions and 212 deletions

@@ -10,6 +10,7 @@ Usage:
 import argparse
 import json
+import re
 import sys
 from pathlib import Path
@@ -412,7 +413,7 @@ def main():
     episode_dirs = sorted(
         [d for d in clips_root.iterdir()
          if d.is_dir() and not d.name.startswith(".") and (d / "clips-metadata.json").exists()],
-        key=lambda d: d.name,
+        key=lambda d: (int(m.group(1)) if (m := re.search(r'(\d+)', d.name)) else 0, d.name),
     )
     if not episode_dirs:
         print("No clip directories found in clips/. Run make_clips.py first.")
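
The new sort key can be illustrated in isolation. The sketch below is not part of the commit: the helper name `episode_sort_key` and the sample directory names are assumptions made for illustration. It applies the same idea as the changed line: sort by the first run of digits in the name, treat names without digits as 0, and fall back to the name itself as a tie-breaker.

```python
import re


def episode_sort_key(name: str) -> tuple[int, str]:
    """Sort key: first integer found in the name (0 if none), then the name."""
    m = re.search(r"(\d+)", name)
    return (int(m.group(1)) if m else 0, name)


# Hypothetical directory names, chosen only to show the ordering difference.
names = ["episode-10", "episode-2", "episode-1", "bonus"]

lexicographic = sorted(names)                   # plain string order
natural = sorted(names, key=episode_sort_key)   # numeric-aware order

print(lexicographic)  # ['bonus', 'episode-1', 'episode-10', 'episode-2']
print(natural)        # ['bonus', 'episode-1', 'episode-2', 'episode-10']
```

Plain string comparison puts "episode-10" before "episode-2" because it compares character by character. Converting the first digit run to an integer restores numeric order, and names with no digits sort first because their numeric part defaults to 0.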