Speech-to-text technology has been around for over half a century. It has seen tremendous developments, with AI being the latest. Nevertheless, its accuracy still depends on certain external factors, such as audio quality.
What do we mean by audio quality?
On a rough scale, it ranges from BBC-studio-level audio to incomprehensible. Where a recording falls on that scale is mainly determined by three factors: the recording equipment, the distance between the speakers and the equipment, and background noise.
Tips for getting the best recording
Remember, prevention is much better than a cure. Follow these handy hints from our friends at LearnUpon:
Use a good-quality microphone. We use Blue Yeti microphones. They’re relatively inexpensive whilst also being high quality (here’s a guide on how to get the most out of them). And if you’re recording in a noisy environment, use dynamic microphones.
Choose your recording location carefully: somewhere insulated from street noise and similar disturbances.
Turn off anything in the room that’s generating ambient noise, for example, a computer or air conditioning. We suggest sitting in the room and making a short recording; playing it back will help you identify any ambient noises.
If you find there’s a low background hiss on your test recording, try reducing the gain on the microphone a little.
Get closer to the microphone, but don’t get too close! Extend your hand fully: the distance from your thumb to your pinky finger is about the optimum distance between you and the mic.
Run a few test recordings to ensure you’re happy with the setup.
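If you would rather put a number on those test recordings than rely on your ears alone, here is a minimal Python sketch. It assumes your test recording is a 16-bit PCM WAV file of the room with nobody speaking, and it measures how loud that "silence" is in dBFS (0 is the loudest a file can be, and more negative is quieter). The function names and the -50 dBFS threshold are our own illustrative choices, not a standard and not something any particular transcription service requires.

```python
import array
import math
import sys
import wave


def rms_dbfs(path):
    """Return the RMS level of a 16-bit PCM WAV file in dBFS (0 = full scale)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        raise ValueError("recording is empty")

    # All channels are pooled together, which is fine for a rough check.
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    return 10 * math.log10(mean_square / 32768.0 ** 2)


def is_quiet_enough(path, threshold_dbfs=-50.0):
    """True if the room-tone recording sits at or below the (assumed) threshold."""
    return rms_dbfs(path) <= threshold_dbfs
```

To try it, record a few seconds of the empty room, save it as, say, room_test.wav (a made-up filename), and call rms_dbfs("room_test.wav"). Then switch off the air conditioning or lower the gain, record again, and compare: a lower (more negative) number means a quieter background.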
Think about your future self!
So despite huge advances in technology, the best indicator of accuracy is still audio quality. If you can’t find yourself a sound studio, think about using an external microphone, as it will save you time downstream when editing the transcript.