scook@gear21.com
Revision History | ||
---|---|---|
Revision v2.0 | April 19, 2002 | Revised by: scc |
Changed license information (now GFDL) and added a new publication. | ||
Revision v1.2 | February 5, 2002 | Revised by: scc |
Added more commercial software listings (sent by Mayur Patel). | ||
Revision v1.1 | October 5, 2001 | Revised by: scc |
Added info for Vocalis Speechware. Fixed/Updated various other items. | ||
Revision v1.0 | November 20, 2000 | Revised by: scc |
Added info on L&H and HTK. | ||
Revision v0.5 | September 13, 2000 | Revised by: scc |
Initial HOWTO submission. | ||
This document is made available under the terms of the GNU Free Documentation License (GFDL), which is hereby incorporated by reference.
For the most recent version of this document, check the LDP archive, or go to: http://www.gear21.com/speech/index.html.
I would like to thank the following people for their help, reviews, and support of this document:
If you have any comments, suggestions, revisions, updates, or just want to chat about ASR, please send an email to me at scook@gear21.com.
The following definitions are the basics needed for understanding speech recognition technology.
This software is primarily for users. An RPM is available.
Homepage: http://www.compapp.dcu.ie/~tdoris/Xvoice/ and http://www.zachary.com/creemer/xvoice.html
This software is primarily for users.
Homepage: http://www.kiecza.de/daniel/linux/index.html
Documents: http://www.kiecza.de/daniel/linux/cvoicecontrol/index.html
This software is primarily for developers.
Homepage: http://www.speech.cs.cmu.edu/sphinx/Sphinx.html
Source: http://download.sourceforge.net/cmusphinx/sphinx2-0.1a.tar.gz
This software is primarily for developers.
FTP site: ftp://svr-ftp.eng.cam.ac.uk/comp.speech/recognition/
If you know of free software that isn't included in the above list, please send me a note at: scook@gear21.com. If you're in the mood, you can also tell me where to get a copy of the software, and share any impressions you have of it. Thanks!
The SDKs and Kits are available for free at: http://www-4.ibm.com/software/speech/dev/sdk_linux.html
More information on Vocalis and Vocalis Speechware is available at: http://www.vocalisspeechware.com and http://www.vocalis.com.
K.K. Chin advised me that the original developers of the HTK (the Speech, Vision and Robotics Group at Cambridge) are still providing support for it. There is also a "free" version available at: http://htk.eng.cam.ac.uk. Also note that Microsoft still owns the copyright to the current HTK code...
There are rumors of more commercial ASR products becoming available in the near future (including L&H). I talked with a couple of L&H representatives at Comdex 2000 (Vegas), but none of them could tell me anything about a Linux release, or even whether L&H planned to release any products for Linux. If you have any further information, please send the details to me at scook@gear21.com.
Most recognizers can be broken down into the following steps:
Framing and Windowing (chopping the data into a usable format)
Filtering (further filtering of each window/frame/freq. band)
Action (perform the function associated with the recognized pattern)
(6) Actions can be just about anything the developer wants. *GRIN*
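As a rough illustration of the framing and windowing step, here is a minimal Python sketch. It is not taken from any of the packages listed in this HOWTO. The function names (`hamming`, `frame_signal`, `windowed_frames`), the Hamming window, the 8 kHz sample rate, and the 25 ms frame / 10 ms hop sizes are all assumptions chosen for the example; a real recognizer may use different values and a different window.

```python
import math


def hamming(n):
    """Return an n-point Hamming window as a list of floats."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_len, hop):
    """Chop samples into overlapping frames of frame_len samples.

    Consecutive frames start hop samples apart. A trailing partial
    frame is dropped rather than padded (an assumption of this sketch).
    """
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def windowed_frames(samples, frame_len, hop):
    """Frame the signal, then taper each frame with a Hamming window."""
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(samples, frame_len, hop)]


if __name__ == "__main__":
    # One second of a 440 Hz test tone at a hypothetical 8 kHz sample rate.
    rate = 8000
    signal = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
    # 200 samples = 25 ms per frame, 80 samples = 10 ms between frame starts.
    frames = windowed_frames(signal, frame_len=200, hop=80)
    print(len(frames), "frames of", len(frames[0]), "samples each")
```

Each windowed frame would then be handed to the filtering step. The overlap between frames is there so that a sound falling on a frame boundary is still fully covered by a neighboring frame.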
If there is a publication that you think should be on this list, please send the information to me at: scook@gear21.com.
"Fundamentals of Speech Recognition". L. Rabiner & B. Juang. 1993. ISBN: 0130151572.
"Applied Speech Technology". A. Syrdal, R. Bennett, S. Greenspan. 1994. ISBN: 0849394562.
"Digital Processing of Speech Signals" L. Rabiner, R. Schafer. 1978. ISBN: 0132136031
"Designing Effective Speech Interfaces". S. Weinschenk, D. T. Barker. 2000. ISBN: 0471375454.
Newsgroup dedicated to computers and speech.
Newsgroup dedicated to users of speech software.
Newsgroup dedicated to speech software and hardware research.
Speech Recognition on Linux Mailing List.
ASR software and accessories. http://www.speechtechnology.com
Microphones and accessories for ASR. http://www.microphones.com
"Speech Recognition Specialists." http://voicerecognition.com
Primarily for Windows users, but good info. http://www.out-loud.com
"The Speech Recognition Information Source." http://www.sayican.com