Stream TTS audio to caller in real-time chunks

TTS audio was sent as a single huge WebSocket frame that overflowed the
browser's 3s ring buffer. It is now streamed in 60ms chunks at real-time
rate. Also increased the browser ring buffer from 3s to 10s as a safety net.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 16:56:22 -07:00
parent 97d37f3381
commit d4e25ceb88
4 changed files with 37 additions and 6 deletions

@@ -673,11 +673,11 @@ async def text_to_speech(request: TTSRequest):
     )
     thread.start()
-    # Also send to active real callers so they hear the AI
+    # Also stream to active real callers so they hear the AI
     if session.active_real_caller:
         caller_id = session.active_real_caller["caller_id"]
         asyncio.create_task(
-            caller_service.send_audio_to_caller(caller_id, audio_bytes, 24000)
+            caller_service.stream_audio_to_caller(caller_id, audio_bytes, 24000)
         )
     return {"status": "playing", "duration": len(audio_bytes) / 2 / 24000}
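
The body of `caller_service.stream_audio_to_caller` is not part of this hunk, so the following is only a rough sketch of the chunking the commit message describes, not the actual implementation. It assumes 16-bit mono PCM (consistent with the `len(audio_bytes) / 2 / 24000` duration above); `chunk_size_bytes`, `stream_pcm_in_chunks`, and the `send` callback are hypothetical names standing in for whatever writes a frame to the caller's WebSocket.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: 16-bit mono PCM, i.e. 2 bytes per sample.
BYTES_PER_SAMPLE = 2


def chunk_size_bytes(sample_rate: int, chunk_ms: int = 60) -> int:
    """Number of bytes in one chunk_ms-long chunk of 16-bit mono PCM."""
    return sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE


async def stream_pcm_in_chunks(
    audio_bytes: bytes,
    sample_rate: int,
    send: Callable[[bytes], Awaitable[None]],  # hypothetical frame sender
    chunk_ms: int = 60,
    pace: bool = True,
) -> int:
    """Send audio in fixed-size chunks at roughly real-time rate.

    Returns the number of chunks sent.
    """
    size = chunk_size_bytes(sample_rate, chunk_ms)
    loop = asyncio.get_running_loop()
    start = loop.time()
    sent = 0
    for offset in range(0, len(audio_bytes), size):
        await send(audio_bytes[offset:offset + size])
        sent += 1
        if pace:
            # Schedule against the start time so that send latency
            # does not accumulate as drift across chunks.
            due = start + sent * chunk_ms / 1000
            delay = due - loop.time()
            if delay > 0:
                await asyncio.sleep(delay)
    return sent
```

At 24000 Hz, one 60ms chunk is 1440 samples, or 2880 bytes, so each frame stays far below the capacity of a multi-second ring buffer. Pacing against a fixed start time rather than sleeping a flat 60ms per chunk is one design option; either approach keeps the sender from outrunning playback.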