Intel Researchers Teach Computers To 'Read Lips' To Improve Speech Recognition
Posted by: [PM] on: 04/29/2003 09:32 AM
Intel Corporation researchers have released software under an open-source license that allows developers to build computers that see and "read lips" the way humans do to better understand spoken commands.
Today's speech recognition algorithms work well when background noise is eliminated or a well-tuned headset is used, but their accuracy rapidly degrades when applications have to cope with noisy environments, such as public places. Combined with face detection algorithms from Intel's OpenCV computer vision library, Audio Visual Speech Recognition (AVSR) software enables computers to detect a speaker's face and track their mouth movements. Synchronizing video data with speech identification enables much more accurate speech recognition, enhancing a wide variety of computer applications in noisy environments.
The AVSR software is part of Intel's OpenCV computer vision library, a toolbox of more than 500 imaging functions that helps researchers develop computer vision applications.
"Intel wants to develop technology that allows computers to naturally interact with the world the way humans do. Human recognition is seldom based on a single type of information. We make decisions by combining information from a variety of sources," said Justin Rattner, Intel senior fellow, Enterprise Platform Group and director of Intel's Microprocessor Research Labs. "The addition of Audio/Video Speech Recognition code to Intel's OpenCV library is certain to drive research and development in vision-assisted speech recognition."
Faster microprocessors, falling camera prices and ten times more video capture bandwidth from technologies like USB2 are all enabling real-time computer vision algorithms to run on mainstream PCs. OpenCV is designed to increase innovation in this field by providing source code for a wide range of computer vision and imaging functions. Since its release in 2000, OpenCV has seen over 500,000 downloads of code and has attracted more than 5,000 registered members to its user group. Developers are using OpenCV code in applications ranging from toys to industrial manufacturing.
The software includes C source code for all of the library's functionality and a royalty-free redistribution license. Information about AVSR can be found at www.intel.com/research/mrl/research/avcsr.htm. The OpenCV web site is located at www.intel.com/research/mrl/research/opencv/. Individuals interested in joining the user group can register at groups.yahoo.com and then can subscribe by sending email to OpenCV at subscribe@yahoogroups.com.