Sam Altman

DeepGram - Search engine for speech

Scott Stephenson
I'm the other founder of DeepGram. We'd love to hear what people think the best application is! We were particle physicists looking for Dark Matter before we started DeepGram. You can catch us totally geeking out on the technical side of doing search (what's the best way to find the signal above the noise?) and we're pretty sure that audio and video datasets are taking over the world.
Jonathan Libov
@stephensonsco @noahxc99 Who came up with the phrase "Sometimes I get depressed and go to airport to pick up other people's relatives"?
Scott Stephenson
@libovness @noajshu I wish that was me! For those not following, you are referring to the top video on Mashily (a demo of our tech that lets you search through Donald Trump speeches, then edit clips to make him 'say anything'). That video was put together by some majestically brilliant reddit user a couple hours after we took Mashily live. We really should chase them down to get a drink ...
Jonathan Libov
@stephensonsco I have watched this video 100 times. I think it is among the weirdest phrases you could come up with in the English language, but it's made especially good by how his tone in "other people's relatives" makes the phrase sound perfectly sensible
Pedro Ruíz
Very interesting concept. What's the hardest challenge you've faced when dealing with accents? I can see an application in education: searching online and finding results from Khan Academy or TED right in my search, without the need for transcription.
Noah Shutty
@piero_ruiz Definitely agree--navigating recorded lectures is a huge pain point we want to tackle. For DeepGram to handle accents, we had to train it with a large audio corpus created by many different speakers. Sometimes it's still helpful to 'type out' how it sounds--e.g.: 'wah ter mell un' instead of 'watermelon'
Scott Stephenson
@piero_ruiz TED and Khan Academy are great applications! If you know some peeps working on those products, definitely send 'em our way ;). We started working on DeepGram while indexing our lives in college (@noajshu made a wearable -- lectures were a big impetus!) so we definitely see the value of being able to discover and navigate content like this.
Tom Pryor
@stephensonsco Tom from the product team at Khan Academy here. Definitely thought the same as @imranghory when I saw this - would be keen to chat about it more
Noah Shutty
@thomaspryor @stephensonsco @imranghory We'd be keen as well! I'll reach out shortly.
Noah Shutty
Hey PH! I'm one of the founders of DeepGram. Looking forward to some great discussion! We're always really excited to talk to people that handle a lot of audio (podcasts, video platforms, people with calls — we have an easy Twilio integration btw!)
priya joseph
@noajshu did u consider a Slack or Zapier integration as the Twilio alternative?
@noajshu Hey Noah, there is a local company that does a similar thing, but it takes a long time to process a video. So my question to you is: how long does it take to process a typical (5 min or less) video with your platform?
Noah Shutty
@ayirpelle great q--we ended up integrating directly with Twilio's API. It looks like Zapier only supports their SMS features, and I can't find the Slack integration--could you point me in the right direction? Thanks!
Noah Shutty
@orliesaurus great question--we take about 2 minutes or less for a typical video, regardless of length. We're always working on making this faster though!
@noajshu so I went on to test the claim. I just uploaded a 30-second clip and it's been 30 minutes... still indexing?
Ryan Hoover
As more videos are created, there's a growing opportunity to index this content and make it searchable. @stephensonsco -- is doing some very interesting things in the podcasting space. You and @annewootton might be able to work together.
Adam Marx
@rrhoover @stephensonsco @annewootton I agree. The thing that's always fascinated me about video content indexing in particular is the ability to index based on content not searchable in the video's title or tags. Being able to search based on the mood or the actual content of the video (which is inherently more subjective than anything else) is something that I think holds a lot of opportunity in the future.
Noah Shutty
@adammarx13 @rrhoover @stephensonsco @annewootton Absolutely! We need content-aware search to navigate the ever-widening torrent of online media. Podcasts are a great example where the audio contains 1000s of words but there's little metadata. We're big fans of what is doing and would love to get in touch with @annewootton !
Jonathon Triest
QA/Compliance just got 100000x more efficient!
Noah Shutty
thanks @jtriest ! We aim to make audio 100000x more useful.
Jonathon Triest
@noajshu 1000000000000000000000000000000000000000x
Kartik Parija
Hey this is really interesting and given that we @adorilabs work on audio experiences; can totally see phenomenal use cases around search, annotation, highlighting etc within audio. This is really close to our heart, so putting this on our watchlist of potential use/collaboration. Congrats @noajshu @stephensonsco and team.
Noah Shutty
@kartikparija @adorilabs @stephensonsco Thanks Kartik! Checking out Adori now.
Kartik Parija
@noajshu @stephensonsco Oh our website is *ahem* vague. Ignore it please. We are busy building product and will update it soon. But would love to connect directly and tell you more. Since my original comment, there has already been an animated discussion within our small team about how we can use Deepgram!!! 😄
Noah Shutty
@kartikparija @stephensonsco We'd love to get in touch! send us a message at :D
Kartik Parija
@noajshu @stephensonsco Did that. Look forward to connecting directly. And I love @rrhoover's suggestion that you connect with @annewootton. This came up in our internal discussion about Deepgram as well. She and her team are doing some smashing work with podcast and radio transcription.
Angad Singh
@kartikparija @adorilabs Would encourage you to also check out ( We have a bunch of audio producers using the product already :)
David Carpe
I'd love to see a demo where you parse the presidential debates (both parties) and then present buttons to search for mention of key issues - that would be dope, and may drive some insane traffic.
Scott Stephenson
@passingnotes This is something we spent time on a few months ago when we made which was almost identical to what you described (FC has changed to be a simpler demo now, but it hasn't been maintained in months). We certainly think political issues are a great way to use search, but we found that FC needed some sort of community manager to constantly find good content and remove duplicate content. One really cute finding is that politicians tell the same jokes over and over and over ...
Noah Shutty
@stephensonsco @passingnotes The same jokes and the same truisms, verbal fillers, etc. It's actually pretty scary.
Noah Shutty
@passingnotes exactly...or "The simple fact is..."
Omri Shabi
Nice concept. What's your next milestone?
Scott Stephenson
@omrishabi I bet @noajshu and I will have a couple things we want to get done! We'd love to make the product more accurate and faster. A big part of bumping up accuracy is training our neural networks to be resilient to different types of audio. Got a jackhammer going off in the background audio? We want to make sure that has no effect on the search performance. Also, searching massive datasets is a challenge that we actively pursue. Right now we can search through millions of minutes in a second, but why not billions?
Noah Shutty
@stephensonsco @omrishabi definitely agree on these two goals. Also want to improve the content navigation UI and crawl media data on the web.
Hey guys, this is amazing!!! I think this will be very useful for analyzing sporting videos, for example a football/soccer match. Normally football clubs have teams that watch their opponents' matches and manually record stats for different situations: number of touches for a player (how many times the player's name is mentioned), fouls, red cards, offsides, mistakes and so on. Imagine having your tech do this automatically based on the parameters that the team wants! I'd be interested in helping out if you decide to work on this use case :). Cheers.
Noah Shutty
@burrewoo Whoa--this is a use case we never thought of. Analytics for football competitors. I'd love to talk more about this. Are the fouls, red cards, etc. marked by specific phrases?
@burrewoo i can see this working as an extra layer/aide to scouting, with proper training, perhaps not a replacement. think there are too many variables and limitations within normal videos that might limit the output value. i have an active interest in the footie+data space - happy to chat if you're a keen fan
@noajshu Hi Noah, awesome :), you can follow me on Twitter @burrewoo. Some of the actions are, but not all of them, such as red cards and major events like scoring a goal, controversial goals (offside/foul causing a goal) and so on ...
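For readers wondering what the stat counting discussed in this sub-thread could look like, here is a minimal sketch. It assumes the commentary has already been transcribed into timestamped segments; the `index_mentions` helper, the segment format, and the player names are all hypothetical and are not part of DeepGram's product or API. It only does plain substring matching, so events that commentators do not announce with a recognizable phrase would be missed.

```python
from collections import defaultdict


def index_mentions(segments, keywords):
    """Map each keyword to the start times (seconds) of segments mentioning it."""
    mentions = defaultdict(list)
    for start, text in segments:
        lowered = text.lower()
        for keyword in keywords:
            # Case-insensitive substring match; no fuzzy or phonetic matching here.
            if keyword.lower() in lowered:
                mentions[keyword].append(start)
    return dict(mentions)


# Hypothetical commentary snippets: (start time in seconds, transcribed text).
commentary = [
    (12.0, "Rivera picks up the ball in midfield"),
    (95.5, "That's a clear foul on Rivera"),
    (96.8, "And the referee shows a red card"),
    (310.2, "Flag is up, offside against Okoye"),
]

stats = index_mentions(commentary, ["Rivera", "foul", "red card", "offside"])
for keyword, times in stats.items():
    print(f"{keyword}: {len(times)} mention(s) at {times}")
```

The per-keyword timestamps double as jump points into the video, which is the part a scouting team would likely care about most.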
Bill Doerrfeld
If I were YouTube, I would buy you.
Scott Stephenson
@doerrfeldbill We were thinking of buying them.
Rudy Yazdi
Sounds like a good idea :)
Noah Shutty
@rdyazdi Thanks Rudy! What would you use it for?
Imran Ghory
Speech analysis is something I've been looking at recently for a fintech company, but from testing the standard API transcription services (those from IBM, AT&T, etc.) the quality hasn't been great. Is exposing the raw transcript something you're looking at, even if it's a probabilistic transcript? Search is one use case, but we've also got other use cases for which we'd want raw data (predicting conversion, risk, fraud, etc.)
Scott Stephenson
@imranghory We definitely can expose the transcript (there's a DeepGram API call for that), but the error rate in the transcript is highly dependent on the input audio quality (better quality audio has better transcripts, phone calls in noisy cars don't). However, our analysis techniques don't rely solely on the transcript being perfect, which is a feature that really sets us apart—especially for medium to poor quality audio. On prediction: We've built AI prediction layers to do what you are mentioning (predicting outcomes) but we don't rely on the text in the transcript being perfect, we build it on top of our fuzzy key phrase search. Contact us if you need that sort of thing!
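To illustrate why a search that tolerates errors can still work on an imperfect transcript, here is a minimal text-only sketch. It is not DeepGram's method (the comment above describes neural networks and a fuzzy key phrase search whose internals are not given in this thread); it simply slides a window over the transcript words and scores each window with Python's standard `difflib.SequenceMatcher`. The function name, the threshold of 0.8, and the sample sentence are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_phrase_search(transcript, phrase, threshold=0.8):
    """Return (word_index, score, matched_text) for windows that resemble phrase."""
    words = transcript.lower().split()
    target = phrase.lower()
    n = len(target.split())
    hits = []
    # Compare every run of n consecutive words against the target phrase.
    for i in range(max(len(words) - n + 1, 0)):
        window = " ".join(words[i:i + n])
        score = SequenceMatcher(None, window, target).ratio()
        if score >= threshold:
            hits.append((i, round(score, 2), window))
    return hits


# Hypothetical noisy transcript: "subscription" was misrecognized.
noisy = "thanks for calling i would like to cancel my subscripshun today"
hits = fuzzy_phrase_search(noisy, "cancel my subscription")
print(hits)
```

An exact substring search for "cancel my subscription" finds nothing in this transcript, while the fuzzy version still flags the window starting at the word "cancel". The threshold trades recall against false positives and would need tuning on real data.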
Imran Ghory
@stephensonsco our current dataset is recorded phone calls in mp3, so not the greatest. Most models that are trained on phone data (i.e. dealing with narrowband) fail on our data due to the lossy compression of mp3. We're looking at switching to uncompressed call recording though. What's the pricing structure for search and the API?
Noah Shutty
@imranghory @stephensonsco We'd be happy to talk about our pricing and set up a demo for your call audio! Shoot us a quick message at and we'll be in touch.
Jaswinder Brar
Consider these new use cases: 1) Associated multiple-language sound bites, for people to use in foreign countries as a handheld translator! 2) Voice mail generator. Thanks, Jaswinder Brar.
Noah Shutty
@jay_bee12345 We're also very interested in multiple languages (right now it's only English). As for 2), you should check out (our mashup generator) if you want to use the search to make a great voicemail sound clip. Reply with the URL if you make a mashup!
Noah Shutty
Any lifeloggers up in here?
Scott Stephenson
@noajshu You are revealing our secret plot to index the world. :-P
Nice. Searches speech but won't accept speech as input :)
Scott Stephenson
@nj_raju I know right?! This is something we want to add soon. Getting it working on different platforms while not frustrating users is a challenge, though.
Ben Myers
Not sure what's up but no matter what I search, it only plays back Hoover Dam related video. I'm on Chrome 49 Mac OSX
Scott Stephenson
@iamhabitat Hey Ben! Thanks for the feedback. Are you searching on our homepage? That demo is only for searching within that single Creative Commons video about the Hoover Dam. We don't yet provide search that is akin to Google for audio/video (but we are certainly working on it). You can search through other files (YouTube videos, your own recorded memos, things like that) by creating an account and uploading them to the DeepGram console. Let us know if you have more feedback (also, we have a Slack channel!)
Chris Strom
I love the concept - eager to use it! I tried to index a YouTube video and got a Failed status. Any ideas why this could happen?
Scott Stephenson
@marketplicity We'll check this out right away. What's the YouTube URL? You can also drop us a line at with any problems. Thanks for letting us know!
Ouriel Ohayon
That would be killer to use this to build a true search engine for podcasts.
Scott Stephenson
@ourielohayon I totally agree! Which platforms would you want to search on?
Abe Storey
I love this. Freakin' brilliant and seamless. Can you do product analysis @stephensonsco ?
Scott Stephenson
@abe_storey We might be able to. What kind of product analysis are you looking for?
Abe Storey
@stephensonsco What's your email? I'll dm you more.
Noah Shutty
@abe_storey @stephensonsco Reach out to contact -at- deepgram -dot- com
Noah Shutty
@abe_storey @stephensonsco Or just leave us a note at :)
Someone needs to go redesign the site. The search doesn't seem to work. Just keeps playing the same video. I think it skips to parts of the video where the search keyword can be found? No idea. I searched "apple" and nothing happened.
Scott Stephenson
@topcities Sorry that it's annoying. There is only a single video to search through as a demo and 'apple' isn't mentioned in the video. You can upload videos from YouTube or your personal audio/video stash by creating an account and dropping it into the console. Let us know how that goes for you!
@stephensonsco The technology is cool. With better messaging on the site I think you'd increase your signup and engagement rates a lot.
Noah Shutty
@topcities @stephensonsco Thanks! Do you have any tips for improving our messaging?
@noajshu @stephensonsco Sure. 1. Don't make users sign up before trying it. Let users play with it a bit then ask them to register. 2. The video demo has many usability issues. The search box insinuates a user can search for a video, not text within the video. The pink buttons feel like they let users paginate to the next video search result. Might be better if you just made a video explaining the value proposition right above the fold. 3. I didn't realize there was more stuff below the video, because I got stuck at the video section and left shortly after because I didn't really get it right away. 4. The slogan didn't hook me because I didn't get it right away. I was just scanning so probably that's the reason. Perhaps make it even simpler for the average joe to get it. Something like "Search through speech within videos" might be easier to understand. You can always explain in more detail later after the user is hooked.