Files
ai-podcast/backend
tcpsyn aa3899b1fc Harden LLM: model fallback chain, reuse client, remove fighting timeouts
- Primary model gets 15s, then auto-falls back through gemini-flash,
  gpt-4o-mini, llama-3.1-8b (10s each)
- Always returns a response — canned in-character line as last resort
- Reuse httpx client instead of creating new one per request
- Remove asyncio.timeout wrappers that were killing requests before
  the LLM service could try fallbacks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:07:39 -07:00
..