Fix live caller audio latency and choppiness
- Reduce capture chunk from 4096 to 640 samples (256ms → 40ms)
- Replace BufferSource scheduling with AudioWorklet playback ring buffer
- Add 80ms jitter buffer with linear interpolation upsampling
- Reduce host mic and live caller stream blocksizes from 4096/2048 to 1024
- Replace librosa.resample with numpy interpolation in send_audio_to_caller

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
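
The capture chunk durations quoted above follow from sample count divided by sample rate. A quick check, assuming a 16 kHz capture rate (implied by 4096 samples being about 256 ms and consistent with the 16000 Hz target in the hunk, but not stated outright in the commit message); the helper name `chunk_ms` is illustrative and not part of the commit:

```python
# Assumption: capture runs at 16 kHz (inferred from 4096 samples ~ 256 ms).
SAMPLE_RATE_HZ = 16_000


def chunk_ms(samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return the duration of one audio chunk in milliseconds."""
    return samples * 1000 / sample_rate_hz


old_chunk_ms = chunk_ms(4096)  # chunk size before this commit
new_chunk_ms = chunk_ms(640)   # chunk size after this commit
print(f"{old_chunk_ms:.0f} ms -> {new_chunk_ms:.0f} ms")
```

Each chunk has to fill before it can be sent, so the chunk duration is a lower bound on the capture-side latency.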
@@ -119,9 +119,12 @@ class CallerService:
         try:
             if sample_rate != 16000:
                 import numpy as np
-                import librosa
                 audio = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32) / 32768.0
-                audio = librosa.resample(audio, orig_sr=sample_rate, target_sr=16000)
+                ratio = 16000 / sample_rate
+                out_len = int(len(audio) * ratio)
+                indices = (np.arange(out_len) / ratio).astype(int)
+                indices = np.clip(indices, 0, len(audio) - 1)
+                audio = audio[indices]
                 pcm_data = (audio * 32767).astype(np.int16).tobytes()
             await ws.send_bytes(pcm_data)
         except Exception as e:
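
The commit message describes the replacement as numpy interpolation, while the hunk above selects the nearest earlier input sample by truncating a fractional index. For comparison, here is a minimal sketch of a linear-interpolation variant built on `np.interp`. The function name `resample_linear` and the `target_rate` parameter are hypothetical and not part of the commit. Like the code in the hunk, it applies no anti-aliasing filter before downsampling.

```python
import numpy as np


def resample_linear(pcm_data: bytes, sample_rate: int, target_rate: int = 16000) -> bytes:
    """Resample 16-bit mono PCM using linear interpolation (no low-pass filter)."""
    if sample_rate == target_rate:
        return pcm_data

    # Same int16 -> float32 scaling as the hunk above.
    audio = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32) / 32768.0
    if len(audio) == 0:
        return pcm_data

    out_len = int(len(audio) * target_rate / sample_rate)
    # Fractional position in the input for every output sample.
    positions = np.arange(out_len) * (sample_rate / target_rate)
    # np.interp blends the two neighbouring input samples and clamps
    # positions past the last sample to the final value.
    resampled = np.interp(positions, np.arange(len(audio)), audio)

    return (np.clip(resampled, -1.0, 1.0) * 32767).astype(np.int16).tobytes()
```

A caller would pass the raw bytes and the source rate, for example `resample_linear(pcm_data, 48000)`. The nearest-sample approach in the hunk is cheaper per chunk; interpolation mainly matters when the rate ratio is not an integer.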