Stream TTS audio to caller in real-time chunks

TTS audio was previously sent as a single large WebSocket frame, which overflowed the browser's 3s ring buffer. It is now streamed in 60ms chunks at real-time rate. The browser ring buffer was also increased from 3s to 10s as a safety net.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 16:56:22 -07:00
parent 97d37f3381
commit d4e25ceb88
4 changed files with 37 additions and 6 deletions
@@ -673,11 +673,11 @@ async def text_to_speech(request: TTSRequest):
    )
    thread.start()

-    # Also send to active real callers so they hear the AI
+    # Also stream to active real callers so they hear the AI
    if session.active_real_caller:
        caller_id = session.active_real_caller["caller_id"]
        asyncio.create_task(
-            caller_service.send_audio_to_caller(caller_id, audio_bytes, 24000)
+            caller_service.stream_audio_to_caller(caller_id, audio_bytes, 24000)
        )

    return {"status": "playing", "duration": len(audio_bytes) / 2 / 24000}
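
For illustration, the chunked real-time streaming described in the commit message might look roughly like the sketch below. The body of `stream_audio_to_caller` is not shown in this hunk, so the names `chunk_pcm`, `stream_pcm`, and the `send` callback are hypothetical, and the audio is assumed to be 16-bit mono PCM (as the `len(audio_bytes) / 2 / 24000` duration calculation suggests). At 24000 Hz, a 60ms chunk is 1440 samples, or 2880 bytes.

```python
import asyncio
from typing import Awaitable, Callable, Iterator

# Assumption: 16-bit mono PCM, matching len(audio_bytes) / 2 / 24000 in the diff.
BYTES_PER_SAMPLE = 2


def chunk_pcm(audio_bytes: bytes, sample_rate: int, chunk_ms: int = 60) -> Iterator[bytes]:
    """Yield consecutive chunk_ms-long slices of raw PCM audio (hypothetical helper)."""
    chunk_size = sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE
    for offset in range(0, len(audio_bytes), chunk_size):
        yield audio_bytes[offset:offset + chunk_size]


async def stream_pcm(
    send: Callable[[bytes], Awaitable[None]],
    audio_bytes: bytes,
    sample_rate: int,
    chunk_ms: int = 60,
) -> int:
    """Send audio chunk by chunk at roughly playback speed; return the chunk count.

    `send` stands in for whatever coroutine writes one frame to the caller's
    WebSocket; it is a placeholder, not the project's actual API.
    """
    loop = asyncio.get_running_loop()
    start = loop.time()
    sent = 0
    for index, chunk in enumerate(chunk_pcm(audio_bytes, sample_rate, chunk_ms)):
        await send(chunk)
        sent += 1
        # Pace against an absolute deadline so per-chunk timing errors do not accumulate.
        deadline = start + (index + 1) * chunk_ms / 1000
        delay = deadline - loop.time()
        if delay > 0:
            await asyncio.sleep(delay)
    return sent
```

Pacing against an absolute deadline rather than sleeping a fixed 60ms after each send keeps the average send rate close to playback rate even when individual sends are slow, which is what stops the receiver's ring buffer from filling faster than it drains.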