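After running the commands above, if no error messages appear, the installation has succeeded. However, the command still does not work at this point: running it produces the following error.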
imyin@develop:~/Downloads/phinx/sphinxbase$ sphinx_lm_convert
sphinx_lm_convert: error while loading shared libraries: libsphinxbase.so.3: cannot open shared object file: No such file or directory
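The message says the dynamic loader cannot find libsphinxbase.so.3. The install prefix is not shown above, so this is an assumption, but a frequent cause is that the library was installed into a directory such as /usr/local/lib that the loader does not know about yet. Refreshing the loader cache with ldconfig, or adding that directory to LD_LIBRARY_PATH, usually resolves it. A minimal sketch for checking from Python whether the loader can see the library (the helper name can_load is made up for illustration):

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        # CDLL goes through the same loader search that failed in the shell.
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libsphinxbase.so.3"  # the library named in the error message
    print(name, "loadable:", can_load(name))
```

If this prints False before the fix and True after it, sphinx_lm_convert should start as well.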
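The relevant methods can be found on its official website, which also has tutorials. Interested readers can look into them on their own.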
Q: Why my accuracy is poor
Speech recognition accuracy is not always great. To test speech recognition you need to run recognition on prerecorded reference database to see what happens and optimize parameters.
You do not need to play with unknown values, the first thing you should do is to collect a database of test samples and measure the recognition accuracy. You need to dump speech utterances into wav files, write the reference text file and use decoder to decode it. Then calculate WER using the word_align.pl tool from Sphinxtrain. Test database size depends on the accuracy but usually it’s enough to have 10 minutes of transcribed audio to test recognizer accuracy reliably. The process is described in tutorialtuning.
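To make the WER step concrete, here is a minimal sketch of the calculation in Python. It is not the word_align.pl tool from Sphinxtrain, only the usual definition (word-level edit distance divided by the number of reference words), and the function name word_error_rate is my own. It splits on whitespace, so for Chinese text you would have to segment into words or characters first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Averaging this over a test set of transcribed recordings gives a number you can track while changing decoder parameters.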
Fortunately, speech_recognition can process just part of an audio file. For example, I can process only the first 15 seconds of the recording.
with test as source:
    audio = r.record(source, duration=15)
r.recognize_google(audio, language='zh-CN')
'那一年的7月里我去了一趟希腊有独自从雅典跑到马拉松江哪条原始的马拉松道路马拉松直雅典一想跑上一趟'
Judging from the result above, this is far better than what Sphinx produced.
Reading the help text shows that speech_recognition can capture not only the beginning of a recording but also a segment from the middle.
In [18]: r.record?
Signature: r.record(source, duration=None, offset=None)
Docstring:
Records up to ``duration`` seconds of audio from ``source`` (an ``AudioSource`` instance) starting at ``offset`` (or at the beginning if not specified) into an ``AudioData`` instance, which it returns.
If ``duration`` is not specified, then it will record until there is no more audio input.
For example, suppose I want to process the audio between 5 and 20 seconds.
with test as source:
    audio = r.record(source, offset=5, duration=15)
r.recognize_google(audio, language='zh-CN')
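Combining offset and duration, a long recording can be walked through one window at a time. The following is only a sketch under a few assumptions: the helper names windows and transcribe_in_windows are made up, the file path is whatever you pass in, and the total length is read from the DURATION attribute that AudioFile exposes once it has been opened. The file is reopened for every window because offset is counted from the current position in the source.

```python
from typing import Iterator, List, Tuple


def windows(total_seconds: float, window_seconds: float) -> Iterator[Tuple[float, float]]:
    """Yield (offset, duration) pairs that cover total_seconds without overlap."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    offset = 0.0
    while offset < total_seconds:
        yield offset, min(window_seconds, total_seconds - offset)
        offset += window_seconds


def transcribe_in_windows(path: str, window_seconds: float = 15.0,
                          language: str = "zh-CN") -> List[str]:
    """Transcribe an audio file window by window (needs network access)."""
    import speech_recognition as sr  # imported here so windows() works without it

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        total = source.DURATION  # length in seconds, set when the file is opened

    texts = []
    for offset, duration in windows(total, window_seconds):
        # Reopen so that offset is measured from the start of the file.
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source, offset=offset, duration=duration)
        try:
            texts.append(recognizer.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            texts.append("")  # nothing intelligible in this window
    return texts
```

Fixed windows can cut a word in half at the boundary, so the joined text may contain small errors at the seams.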