Replace token-based truncation with sentence-count limiting

- max_tokens restored to 150 so the LLM can finish its thoughts
- New limit_sentences() keeps only first 2 complete sentences
- Never cuts mid-sentence — always ends at punctuation
- Applied to both chat and auto-respond paths
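The limiting helper described above might look roughly like this (a sketch only; the actual `limit_sentences()` signature and regex are assumptions, not the committed code):

```python
import re

def limit_sentences(text: str, max_sentences: int = 2) -> str:
    """Keep only the first max_sentences complete sentences.

    Hypothetical sketch of the limit_sentences() helper from the
    commit message; the real implementation may differ.
    """
    # Grab runs of text ending in sentence punctuation, keeping the
    # punctuation attached to each sentence.
    sentences = re.findall(r"[^.!?]+[.!?]+", text)
    if not sentences:
        # No complete sentence found: return the text unchanged rather
        # than cut mid-sentence.
        return text.strip()
    return "".join(sentences[:max_sentences]).strip()
```

Because the reply is truncated at sentence boundaries rather than at a token count, raising `max_tokens` back to 150 only gives the model room to finish; the visible length is still capped by the sentence limit.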

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 17:15:04 -07:00
parent 9c5f7c5cfe
commit 9d4b8a0d22
2 changed files with 24 additions and 17 deletions


@@ -124,7 +124,7 @@ class LLMService:
             json={
                 "model": self.openrouter_model,
                 "messages": messages,
-                "max_tokens": 75,
+                "max_tokens": 150,
             },
         )
         response.raise_for_status()