The best audio format supported by FastRTC? #138
Comments
Do you get an error with [...]?
Thanks for the quick response, @freddyaboulton. The following code block works when the value of the [...]:

    with tts_client.audio.speech.with_streaming_response.create(
        model=settings.TTS_MODEL,
        voice=settings.TTS_VOICE,
        input=text,
        response_format=settings.TTS_AUDIO_FORMAT,
        extra_body={"backend": settings.TTS_BACKEND, "language": settings.LANGUAGE},
    ) as stream_audio:
        # Iterate through all audio chunks in the stream
        for i, audio_chunk in enumerate(stream_audio.iter_bytes(chunk_size=1024)):
            print(f"Processing audio chunk {i}")
            audio_array = np.frombuffer(audio_chunk, dtype=np.int16).reshape(1, -1)
            yield (24000, audio_array)

But when the value of the [...]:

    audio_array = np.frombuffer(audio_chunk, dtype=np.uint8).reshape(1, -1)
    yield (24000, audio_array)

(By the way, I don't know much about audio processing; the code above was suggested by ChatGPT.) Then the app raises the following error:

Please advise. Thank you.
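A note on the snippets in this comment: `np.frombuffer` only reinterprets raw bytes as samples, so it is meaningful for uncompressed PCM but not for compressed streams such as Opus or MP3, which have to be decoded first. Even with PCM, a chunk can end in the middle of a sample, and `np.frombuffer(..., dtype=np.int16)` fails on a buffer whose length is odd. The sketch below is one way to guard against that. It assumes 16-bit PCM output from the TTS server, and `aligned_pcm_chunks` is a hypothetical helper name, not part of FastRTC or the OpenAI client.

```python
from typing import Iterable, Iterator


def aligned_pcm_chunks(chunks: Iterable[bytes], sample_width: int = 2) -> Iterator[bytes]:
    """Re-chunk a byte stream so every yielded piece holds whole samples.

    sample_width=2 assumes 16-bit PCM (an assumption about the TTS output).
    """
    leftover = b""
    for chunk in chunks:
        data = leftover + chunk
        # Largest prefix whose length is a multiple of the sample width.
        usable = len(data) - (len(data) % sample_width)
        if usable:
            yield data[:usable]
        # Carry the partial sample over to the next chunk.
        leftover = data[usable:]
    # A trailing partial sample is dropped: it cannot be decoded on its own.


# Example: three chunks that split 16-bit samples across their boundaries.
pieces = list(aligned_pcm_chunks([b"\x01\x02\x03", b"\x04\x05", b"\x06\x07"]))
print(pieces)
```

Each yielded piece could then be passed to `np.frombuffer(piece, dtype=np.int16).reshape(1, -1)` as in the working snippet above, for example by wrapping `stream_audio.iter_bytes(chunk_size=1024)` with this helper.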
Please use int16 or float32, not int8! Let me know if that works.
@freddyaboulton Unfortunately, neither option is working. I can only hear a faint, incomprehensible sound like "seh... seh...". Using int16 doesn't produce an error, but float32 raises: [...]
From my understanding, Opus is the best audio format supported by WebRTC. May I know if FastRTC supports it? If not, which audio format works best with FastRTC?
I'm using https://github.com/remsky/Kokoro-FastAPI?tab=readme-ov-file#features, which supports the following audio formats: