OpenAI
p/openai-api
APIs and tools for building AI products
Raghav Gupta

What are your thoughts on Openai's Realtime API?

Featured
6
•
I don't see a lot of products using the realtime api in building their conversation ai agents. Given that it now has realtime communication support through WebRTC allowing low latency conversations, I expected it to blow up. Are there any limitations of this model like hallucinations and or is it just too expensive for commercial use?
Replies
Best
Rajiv Ayyangar
Great question - maybe someone like @jordan_dearsley from @Vapi or @kwindla from @Daily.co / @pipecat could weigh in? I think the non-openAI infra for realtime AI has progressed really fast, so maybe there are just better, cheaper, more flexible building blocks outside of the Realtime API?
Kwindla Kramer
The OpenAI Realtime API and the Gemini Multimodal Live API are both very exciting. They are "speech-to-speech" APIs, which most of us in this space think are clearly the future! But both APIs are still in beta and are missing some features that are necessary today for most production/enterprise voice AI use cases: context management, configurable turn detection, proxying function calls to web hooks, ways to model conversations as state machines/transitions, etc. And the models behind these APIs are not quite as steerable or reliable as their "full" counterparts available via the HTTP inference APIs. In general, people are using orchestration frameworks like Pipecat to combine speech-to-text, LLM, and text-to-speech model capabilities. Voice AI deployments are getting a lot of traction with this architecture. Here are some detailed notes about the OpenAI Realtime API: https://www.latent.space/p/realt...
Rajiv Ayyangar
@kwindla thanks for weighing in - it does seem like a special moment for voice AI, where the capabilities of the core models are complemented by the recent progress in the orchestration layer. As a side note, it's cool to see how collaborative and fast-moving the companies in the space are. E.g. Vapi, Daily / Pipecat, Coval...it's a tight community which allows lots of speed while still having the building blocks fit together. At least that's my feeling as a somewhat outside observer.
Raghav Gupta
@kwindla Thanks for the answer 😊
Rajiv Ayyangar
@kwindla @raghav_gupta22 Have you tried building with the realtime api or other building blocks like Pipecat? Got a PH launch soon? :)
Aman Singh
@rajiv_ayyangar check http://speaktheglobe.com, I have implemented Realtime API with Google Maps