Microsoft’s Speech Recognition Tech Achieves Human Parity--Sort Of

Lucian Armasu · Oct 19, 2016

Microsoft announced its latest major milestone in machine learning-based speech recognition, which now equals the word error rate (WER) of humans. However, the WER doesn't tell the whole story, and machines still have a way to go in understanding humans.
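For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not Microsoft's evaluation pipeline. The function name `word_error_rate` and the sample sentences are hypothetical, and real scoring tools also normalize text (case, punctuation, filler words) before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because every word error counts the same, a dropped "the" costs as much as a wrong name or number, which is one reason a single WER figure does not tell the whole story.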


jaber2 · Oct 19, 2016

Just want to know how long until we get a universal translator.

kittle · Oct 19, 2016

I wonder how that thing will work when plugged into Siri on a noisy subway?

Icepilot · Oct 19, 2016

"... we're living in a time when machines are beginning to truly understand humans and the world around us."
One word too far, truly.

stuartturner34 · Oct 19, 2016

“We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is an historic achievement.”

A typo in a quote of a scientist talking about word error ratings. So meta.

alextheblue · Oct 20, 2016

kittle :

That depends on your audio hardware/software more than anything. For example, on a PC it would depend on the type and quality of the microphone / mic array, the sound card, audio drivers, recording software, etc. There are a couple of places where there are opportunities for noise cancellation, depending on the hardware and software used. The result gets handed to this translation software. Garbage in, garbage out: you have to feed it good audio for it to do its job. The situation isn't all that different for a smartphone. Unfortunately, the iPhone probably wouldn't do as good a job as a smartphone with a HAAC twin membrane quad-mic array.

jackt · Oct 21, 2016

Using supercomputers or a normal PC?

Kafantaris · Oct 22, 2016

Microsoft's AI-driven voice recognition has left all rivals in the dust. Great work by Dr. Xuedong Huang's speech team.

bit_user · Oct 30, 2016

Good job digging into the error rates, Lucian.

jackt :

This is the question I had. How much compute does it use? It's not a small detail whether this requires a long time on a big GPU, or whether it can run on a smartphone in real time. If too much compute is required, then this won't be deployed in most real-world use cases for years.

BTW, humans are still way more energy efficient.

bit_user · Oct 30, 2016

stuartturner34 :

It would be, but where's the error?
