Fix unnatural response cutoffs

- Replace aggressive sentence-count limiting with ensure_complete_thought()
  which only trims if the LLM was actually cut off mid-sentence
- Soften prompt guidance toward natural brevity instead of a rigid sentence count
- Set max_tokens to 100 as a natural length cap
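
A minimal sketch of what `ensure_complete_thought()` could look like. The function name comes from the commit message; the trimming heuristic shown here (drop a trailing fragment only when the reply does not end at a sentence boundary) is an assumption, not the actual implementation:

```python
def ensure_complete_thought(text: str) -> str:
    """Trim a trailing fragment only if the LLM was cut off mid-sentence.

    Hypothetical sketch based on the commit message; the real
    implementation in this repo may differ.
    """
    text = text.strip()
    if not text:
        return text
    # Already ends at a sentence boundary: leave the reply untouched.
    if text[-1] in ".!?\"'":
        return text
    # Otherwise drop the incomplete final sentence, keeping the rest.
    cut = max(text.rfind(p) for p in ".!?")
    if cut != -1:
        return text[: cut + 1]
    # No complete sentence at all: better to return the text as-is
    # than to produce an empty reply.
    return text
```

Unlike a hard sentence-count limit, this only intervenes when truncation actually happened, so naturally short but complete replies pass through unchanged.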

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 17:18:22 -07:00
parent 9d4b8a0d22
commit a1c94a3682
2 changed files with 16 additions and 27 deletions

@@ -124,7 +124,7 @@ class LLMService:
             json={
                 "model": self.openrouter_model,
                 "messages": messages,
-                "max_tokens": 150,
+                "max_tokens": 100,
             },
         )
         response.raise_for_status()