Microsoft's Cortana boosts speech recognition accuracy

Microsoft has made improvements to its conversational speech recognition system
This resulted in a 5.1 per cent word error rate, in line with trained professional transcribers
Last year the firm achieved a 5.9 per cent error rate, equal to that of the average person
The Washington-based company is now setting its sights on getting machines to understand the meaning behind the words they recognise

Published: 04:17 EDT, 21 August 2017 | Updated: 04:58 EDT, 21 August 2017

A new milestone in human speech recognition has been reached by Microsoft, matching the accuracy of trained human transcribers.

The firm's software, used in its Cortana voice assistant, has achieved a 5.1 per cent word error rate, putting it on a par with professionals.

One of the big frustrations of voice recognition has been getting machines to accept commands, a process which often involves repetition and exaggerated speech.

The development means the company's products could soon accept commands with the precision of a professional human transcriber.

WHAT NEXT?

Microsoft says it is now turning its attention to solving some of the remaining challenges facing speech recognition, as well as teaching machines to understand what they hear.

Some problems still to be addressed by the software include achieving human levels of recognition in noisy environments with distant microphones as well as recognising accented speech or speaking styles and languages for which only limited training data is available.

Microsoft says it has much work to do in teaching computers not just to transcribe the words spoken, but also to understand their meaning and intent.

The firm believes moving from recognising to understanding speech is the next major frontier for speech technology.

The findings were detailed in a technical report published by Microsoft on Saturday.

Last year, researchers from Microsoft Artificial Intelligence and Research reached a 5.9 per cent error rate, the same as the average person.

The new paper details how the researchers used improvements in AI to refine the firm's conversational speech recognition system.

This allows the system to better recognise the waveform of speech patterns, moment to moment and word to word.

It also uses the context of a conversation to predict what is likely to come next.

The technology is used in the company's Cortana voice assistant that allows users to perform a range of tasks, from checking the weather to chatting.

It also provides a voice translation service.

Writing on the Microsoft Research blog, technical fellow Xuedong Huang said: 'Reaching human parity with an accuracy on par with humans has been a research goal for the last 25 years.

'Microsoft's willingness to invest in long-term research is now paying dividends for our customers in products and services such as Cortana, Presentation Translator, and Microsoft Cognitive Services.

'It's deeply gratifying to our research teams to see our work used by millions of people each day.'

Switchboard is a body of recorded telephone conversations that the speech research community has used for more than 20 years to test voice recognition systems.

The task involves transcribing conversations between strangers discussing topics ranging from sports to politics.

Previous research has shown that humans achieve higher levels of agreement on the precise words spoken as they expend more care and effort, as in the case of professional transcribers.

Mr Huang added: 'While achieving a 5.1 per cent word error rate on the Switchboard speech recognition task is a significant achievement, the speech research community still has many challenges to address.

'[This includes] achieving human levels of recognition in noisy environments with distant microphones, in recognising accented speech, or speaking styles and languages for which only limited training data is available.

'Moreover, we have much work to do in teaching computers not just to transcribe the words spoken, but also to understand their meaning and intent.

'Moving from recognising to understanding speech is the next major frontier for speech technology.'
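
For readers wondering how a figure such as 5.1 per cent is arrived at, word error rate is conventionally calculated as the smallest number of word substitutions, deletions and insertions needed to turn the machine's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard calculation only. It is not Microsoft's scoring code, the function name `word_error_rate` is made up for this example, and it assumes words are separated by plain spaces with no further text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and compares words exactly,
    which is an assumption rather than any official scoring procedure.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"{word_error_rate(reference, hypothesis):.1%}")
```

On this measure, a 5.1 per cent error rate corresponds to roughly one word in every 20 being transcribed wrongly, dropped or added.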