Ep13 publish, MLX whisper, voicemail system, hero redesign, massive topic expansion

- Switch whisper transcription from faster-whisper (CPU) to lightning-whisper-mlx (GPU)
- Fix word_timestamps hanging, use ffprobe for accurate duration
- Add Cloudflare Pages Worker for SignalWire voicemail fallback when server offline
- Add voicemail sync on startup, delete tracking, save feature
- Add /feed RSS proxy to _worker.js (was broken by worker taking over routing)
- Redesign website hero section: ghost buttons, compact phone, plain text links
- Rewrite caller prompts so callers get to their point faster and follow the host's lead
- Expand TOPIC_CALLIN from ~250 to 547 entries across 34 categories
- Add new categories: biology, psychology, engineering, math, geology, animals,
  work, money, books, movies, relationships, health, language, true crime,
  drunk/high/unhinged callers
- Remove bad Inworld voices (Pixie, Dominus), reduce repeat caller frequency
- Add audio monitor device routing, uvicorn --reload-dir fix
- Publish episode 13
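
For the "use ffprobe for accurate duration" item: ffprobe can report a file's container-level duration as a single bare number, which is more reliable than inferring duration from decoded samples. A minimal sketch of that approach (the helper names `probe_duration` and `parse_duration` are hypothetical, not from this repo):

```python
import subprocess


def parse_duration(ffprobe_output: str) -> float:
    """Parse ffprobe's bare duration output (e.g. '83.214000\n') into seconds."""
    return float(ffprobe_output.strip())


def probe_duration(path: str) -> float:
    """Return media duration in seconds by asking ffprobe for format=duration.

    -of default=noprint_wrappers=1:nokey=1 makes ffprobe print only the value,
    so the output is a single float on one line.
    """
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_duration(out)
```

The parsing step is separated out so it can be exercised without ffmpeg installed.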

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:56:47 -07:00
parent 8d3d67a177
commit 3164a70e48
23 changed files with 2944 additions and 512 deletions


@@ -13,10 +13,8 @@ def get_whisper_model() -> WhisperModel:
     """Get or create Whisper model instance"""
     global _whisper_model
     if _whisper_model is None:
-        print("Loading Whisper tiny model for fast transcription...")
-        # Use tiny model for speed - about 3-4x faster than base
-        # beam_size=1 and best_of=1 for fastest inference
-        _whisper_model = WhisperModel("tiny", device="cpu", compute_type="int8")
+        print("Loading Whisper base model...")
+        _whisper_model = WhisperModel("base", device="cpu", compute_type="int8")
         print("Whisper model loaded")
     return _whisper_model
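
The hunk above keeps a lazy-singleton pattern: the model is built once on first use and cached in a module global, so later calls reuse it. A minimal sketch of that pattern with a stand-in loader (the real code constructs a `WhisperModel`, which needs weights on disk):

```python
_model = None  # module-level cache, mirrors _whisper_model in the diff


def get_model(loader=lambda: object()):
    """Create the expensive object on the first call, then reuse it.

    `loader` is a hypothetical injection point for this sketch; the real
    function hardcodes the WhisperModel constructor.
    """
    global _model
    if _model is None:
        _model = loader()  # runs at most once per process
    return _model
```

Because the cache lives at module scope, every caller in the process shares one model instance.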
@@ -100,13 +98,13 @@ async def transcribe_audio(audio_data: bytes, source_sample_rate: int = None) ->
     else:
         audio_16k = audio
-    # Transcribe with speed optimizations
+    # Transcribe
     segments, info = model.transcribe(
         audio_16k,
-        beam_size=1,  # Faster, slightly less accurate
-        best_of=1,
-        language="en",  # Skip language detection
-        vad_filter=True,  # Skip silence
+        beam_size=3,
+        language="en",
+        vad_filter=True,
+        initial_prompt="Luke at the Roost, a late-night radio talk show. The host Luke talks to callers about life, relationships, sports, politics, and pop culture.",
     )
     segments_list = list(segments)
     text = " ".join([s.text for s in segments_list]).strip()