Speech Recognition in Windows Vista


Windows Vista comes with good built-in speech recognition support. I carried out a small exercise to roughly evaluate its dictation system by dictating a short dummy job application on my laptop. First I typed it by hand, and then I tried to dictate exactly the same text using the built-in ASR (automatic speech recognition). I spoke slowly, carefully and very clearly, phrase by phrase. Sometimes I even deleted sentences that were badly misrecognized and spoke them again, so you could call it a mixed human + machine task aimed at the highest possible performance. I achieved satisfactory accuracy. Here are the results and some observations I noted.

Original (hand typed)
Respected Sir,

I am computer science graduate from university of Oxford. I completed my graduation in 2004. Thereafter I worked with couple of companies located in united kingdom. But in 2008 because of recession there was lay off in my company. Thousands of employees had to loose their job. I was one of them.
I am applying for post of senior software engineer at your renowned organization. I am attaching herewith all my academic and professional details in my resume.

Looking forward to hear from you.

With Regards,

Dictated (slowly, discretely and phrase by phrase)
Respected so,

I'm a computer science graduate from University of Oxford.  I completed my position in 2004.  Thereafter I worked with couple of companies located in United Kingdom.  But until 2008 because of recession there was layoffs in my company.  Thousands of employees had to lose their job.  I was one of them.
I'm happening for post of senior software engineer at you and the node organization.  I'm attaching the unit for my academic and professional bidders in my resume.

With three guards,

Dictated (sentence by sentence but fluently)
The specter said,

I'm Computer science graduate from University of Oxford.  The computer my position because before.  Thereafter I worked with couple of companies located in United Kingdom.  But in 2008 because of the session there was a layoff in my company.  Thousands of employees at those that sell.  I was one of them.
I'm applying for also see a software engineer at 110 no provision.  The magazine unit on Monday can be and professional gators in mime resume.

Looking for player from you and

, john

For carefully dictated text:
  1. Accuracy is really good for simple sentences made up of common, regular words, e.g. "I'm a computer science graduate from University of Oxford." and "Thousands of employees had to lose their job." These sentences were recognized correctly in one go.
  2. Phrases like "renowned organization" are very difficult to dictate. Even after two attempts it was not recognized correctly, probably because "renowned" is a less common word. The same thing happened with words such as "herewith" and "graduation".
  3. "Regards" is recognized as "three guards". This is more of an issue with the recognizer's word insertion penalty, which is understandable.
  4. "Layoff" is recognized as "layoffs", a kind of error that is also common in speech recognizers.
  5. Years like 2004 and 2008 were recognized in a single go!
  6. Spoken punctuation words like "period" (.) and "comma" (,) were also recognized correctly in many cases. But the word "Enter" (carriage return) was sometimes not recognized correctly.
I also tried reading (carefully and slowly) some sentences from a technical manual, but the recognition was not satisfactory. Dictation does not work well for sentences that contain abbreviations or technical data.
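The observations above are qualitative. One rough way to put a number on them is the word error rate (WER): the count of word substitutions, insertions and deletions needed to turn the recognized text into the reference, divided by the number of reference words. The following is only a minimal Python sketch of that idea, not something used in the original exercise. The function names `tokenize` and `word_error_rate` are made up for illustration, and the tokenization (lowercasing, ignoring punctuation) is an assumption.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra word was recognized
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# "Regards" -> "three guards" is one substitution plus one insertion,
# measured against a two-word reference.
print(word_error_rate("With Regards", "With three guards"))  # 1.0
```

Passing the hand-typed letter as `reference` and each dictated version as `hypothesis` would give one score per dictation style. Note that WER can exceed 1.0 when the recognizer inserts many extra words.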

So right now one can use this technology to dictate common applications, daily blogs, travelogues, simple "hi-hello" type emails, etc. But it is useless for office work such as writing technical or legal documents and whitepapers, where a lot of uncommon, non-standard words and abbreviations occur.
