Installing Kaldi on Linux/macOS - 编程知识网


Contents

    • I. About Kaldi
    • II. Installation
      • 1. Download the source code
      • 2. Read the INSTALL files
        • root – INSTALL
        • tools – INSTALL
      • 3. Build tools
        • Error: "ERROR: cannot verify"
        • Install MKL
        • Install irstlm, kaldi_lm, and openblas
      • 4. Build src
    • III. Testing
      • Error 1: Bad FST header
      • Error 2: gmm-init-mono: command not found
      • Error 3: arpa2fst: command not found

I. About Kaldi

Kaldi is a toolkit for speech recognition, intended for use by speech recognition researchers and professionals.

  • Official site: https://www.kaldi-asr.org
  • GitHub: https://github.com/kaldi-asr/kaldi
  • Existing models: https://www.kaldi-asr.org/models.html
  • Official documentation: https://www.kaldi-asr.org/doc/

References

  • Kaldi installation tutorial for Ubuntu 18.04 (a summary of the pitfalls hit during installation)
    https://zhuanlan.zhihu.com/p/148524930
  • AssemblyAI / kaldi-install-tutorial
    https://github.com/AssemblyAI/kaldi-install-tutorial/blob/main/setup.sh

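II. Installation

1. Download the source code

You can download the source directly from https://github.com/kaldi-asr/kaldi.


Some users report that cloning it this way works better: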

git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden

If your network connection is poor, you can download it here instead: https://download.csdn.net/download/lovechris00/87301550


2. Read the INSTALL files

root – INSTALL

The INSTALL file in the root directory reads:

This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[Option 1 in the following does not apply to native Windows install, see windows/INSTALL or following Option 2]

Option 1 (bash + makefile):

  Steps:
    (1) go to tools/  and follow INSTALL instructions there.
    (2) go to src/ and follow INSTALL instructions there.

Option 2 (cmake):

  Go to cmake/ and follow INSTALL.md instructions there.
  Note, it may not be well tested and some features are missing currently.

tools – INSTALL

The INSTALL file under tools reads:

To check the prerequisites for Kaldi, first run

  extras/check_dependencies.sh

and see if there are any system-level installations you need to do. Check the output carefully. There are some things that will make your life a lot easier if you fix them at this stage. If your system default C++ compiler is not supported, you can do the check with another compiler by setting the CXX environment variable, e.g.

  CXX=g++-4.8 extras/check_dependencies.sh

Then run

  make

which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe.
OpenFst requires a relatively recent C++ compiler with C++11 support, e.g. g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3.
If your system default compiler does not have adequate support for C++11, you can specify a C++11
compliant compiler as a command argument, e.g.

  make CXX=g++-4.8

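3. Build tools

From the root directory, enter the tools folder:

cd tools
# Check dependencies
./extras/check_dependencies.sh

If any package is missing, the script tells you what to install.
On macOS, install it with brew install xxx.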
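
The platform-dependent install step can be sketched as a small helper. This is only an illustration: install_hint is a hypothetical function name, sox is just an example package name, brew for macOS comes from the text above, and apt-get for Linux is an assumption that only holds on Debian/Ubuntu-style systems.

```shell
# install_hint OS PKG: print a suggested install command for PKG.
# Hypothetical helper, not part of Kaldi.
install_hint() {
  case "$1" in
    Darwin) echo "brew install $2" ;;
    Linux)  echo "sudo apt-get install $2" ;;  # assumption: Debian/Ubuntu
    *)      echo "install $2 with your system's package manager" ;;
  esac
}

# Example: pass the kernel name reported by uname.
install_hint "$(uname -s)" sox
```

The package names to pass in are the ones reported by check_dependencies.sh.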


Build:

make -j 4

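Running this command downloads the third-party packages and extracts them automatically.
If a package fails to install later (it was not installed, or the downloaded archive has the wrong size), you can run make again.
Extract by hand any archive that was not extracted automatically.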
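
Since rerunning make is the suggested remedy for a failed download, the rerun can be wrapped in a small retry loop. This is a minimal sketch; retry is a hypothetical helper, and the attempt count of 3 is an arbitrary choice.

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
# Hypothetical helper, not part of Kaldi.
retry() {
  max="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max failed: $*" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage, from the tools directory:
#   retry 3 make -j 4
```

A retry only helps with transient failures such as interrupted downloads; an error that repeats on every attempt still needs to be read and fixed.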


Error: "ERROR: cannot verify"

ERROR: cannot verify www.openfst.org’s certificate, issued by ‘CN=R3,O=Let’s Encrypt,C=US’:
Issued certificate has expired.
To connect to www.openfst.org insecurely, use `--no-check-certificate'.
ERROR: cannot verify www.openslr.org’s certificate, issued by ‘CN=R3,O=Let’s Encrypt,C=US’:


In this case you need to edit the Makefile.

Find the place where www.openfst.org appears.
The original content is:

openfst-$(OPENFST_VERSION).tar.gz:
	if [ -d "$(DOWNLOAD_DIR)" ]; then \
		cp -p "$(DOWNLOAD_DIR)/openfst-$(OPENFST_VERSION).tar.gz" .; \
	else \
		$(WGET) -nv -T 10 -t 1 http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-$(OPENFST_VERSION).tar.gz  || \
		$(WGET) -nv -T 10 -t 3 -c https://www.openslr.org/resources/2/openfst-$(OPENFST_VERSION).tar.gz; \
	fi

Append the --no-check-certificate option to each $(WGET) command:

$(WGET) -nv -T 10 -t 1 http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-$(OPENFST_VERSION).tar.gz --no-check-certificate || \
$(WGET) -nv -T 10 -t 3 -c https://www.openslr.org/resources/2/openfst-$(OPENFST_VERSION).tar.gz --no-check-certificate; \

Then run make again.
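
The manual edit above can also be scripted. The following is a sketch, not the method used in this post: add_no_check_certificate is a hypothetical helper that inserts the flag directly after every $(WGET) in the file (not only the OpenFst rule) rather than at the end of the line, which wget treats the same way. It writes a backup copy first.

```shell
# add_no_check_certificate FILE: insert --no-check-certificate after every
# $(WGET) in FILE, skipping lines that already carry the flag.
# Hypothetical helper; the original is kept as FILE.bak.
add_no_check_certificate() {
  file="$1"
  cp -p "$file" "$file.bak"
  sed '/--no-check-certificate/!s/\$(WGET) /$(WGET) --no-check-certificate /g' \
    "$file.bak" > "$file"
}

# Usage, from the tools directory:
#   add_no_check_certificate Makefile
```

Because the rule calls wget through the $(WGET) variable, overriding that variable on the command line (make WGET="wget --no-check-certificate") should have the same effect without editing the file, though that route was not tried in this post.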


Build the third-party packages:

make openfst
make cub
make sclite
make sph2pipe

If you later see the error "you may not have installed OpenFst", it is usually because OpenFst was not built successfully at this step.
Reference article: https://blog.csdn.net/weixin_42103947/article/details/119842650
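
One way to confirm that OpenFst really was built is to look for its binaries. This is a minimal sketch: openfst_bin_dir is a hypothetical helper, the openfst-<version>/bin layout is taken from the path that appears later in this post (tools/openfst-1.7.2/bin), and fstcompile is used as a representative OpenFst binary.

```shell
# openfst_bin_dir TOOLS_DIR: print the first TOOLS_DIR/openfst-*/bin
# directory that contains an executable fstcompile; return 1 if none does.
# Hypothetical helper, not part of Kaldi.
openfst_bin_dir() {
  for dir in "$1"/openfst-*/bin; do
    if [ -x "$dir/fstcompile" ]; then
      echo "$dir"
      return 0
    fi
  done
  return 1
}

# Usage, from the tools directory:
#   openfst_bin_dir . || echo "OpenFst is not built yet; run: make openfst"
```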


Install MKL

On Linux, install it with the following command:

./extras/install_mkl.sh

On a Mac, the command fails with:

./extras/install_mkl.sh: This script can be used on Linux only, and your system is Darwin.
Installer packages for Mac and Windows are available for download from Intel:


You need to download it from the following site:
https://software.intel.com/mkl/choose-download

I downloaded the offline installer; open the app to install it.



Install irstlm, kaldi_lm, and openblas

sudo ./extras/install_irstlm.sh
sudo ./extras/install_kaldi_lm.sh
sudo ./extras/install_openblas.sh

4. Build src

src is the sibling directory of tools.
Switch from tools to src:

cd ../src
./configure --shared

If you use CUDA, you need to run the following:

./configure --use-cuda --cudatk-dir=/usr/local/cuda-11.3/

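You can check your CUDA version with nvcc -V. The example above uses cuda-11.3.
CUDA is usually installed under /usr/local/cuda-v.xx; set the option to your own CUDA path.


If you skip this step, the later make step fails with the following error: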

fatal error: cublas_v2.h: No such file or directory #include <cublas_v2.h>

Merely adding /usr/local/cuda-11.3/targets/x86_64-linux/lib and /usr/local/cuda-11.3/targets/x86_64-linux/include to your environment variables does not fix this error.
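
The value for --cudatk-dir can be derived from the location of nvcc instead of typed by hand. This is a sketch under stated assumptions: both function names are hypothetical, nvcc is assumed to live in <root>/bin, and cublas_v2.h is assumed to be reachable as <root>/include/cublas_v2.h.

```shell
# cuda_root_from_nvcc NVCC_PATH: strip the trailing /bin/nvcc to get the
# toolkit root. Hypothetical helper; assumes nvcc lives in <root>/bin.
cuda_root_from_nvcc() {
  dirname "$(dirname "$1")"
}

# has_cublas_header ROOT: succeed if ROOT/include/cublas_v2.h exists.
# Hypothetical helper; the include/ location is an assumption.
has_cublas_header() {
  [ -e "$1/include/cublas_v2.h" ]
}

# Usage, from the src directory:
#   root=$(cuda_root_from_nvcc "$(command -v nvcc)")
#   has_cublas_header "$root" && ./configure --use-cuda --cudatk-dir="$root"
```

If the header check fails, the derived directory is probably not the toolkit root, and the path should be set manually as described above.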


Then build:

make depend -j 8
make -j 8
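
The job count 8 suits a machine with about that many cores. A sketch for picking the count from the machine itself follows; cpu_count is a hypothetical helper that relies on nproc (Linux, coreutils) or sysctl -n hw.ncpu (macOS) and falls back to 1.

```shell
# cpu_count: print the number of available CPUs, falling back to 1.
# Hypothetical helper, not part of Kaldi.
cpu_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  elif [ "$(uname -s)" = "Darwin" ]; then
    sysctl -n hw.ncpu
  else
    echo 1
  fi
}

# Usage, from the src directory:
#   make depend -j "$(cpu_count)"
#   make -j "$(cpu_count)"
```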

III. Testing

From the kaldi directory:

cd egs/yesno/s5
./run.sh

If you get output similar to the following, the run basically succeeded (Kaldi is installed correctly):

steps/diagnostic/analyze_lats.sh: see stats in exp/mono0a/decode_test_yesno/log/analyze_lattice_depth_stats.log
local/score.sh --cmd utils/run.pl data/test_yesno exp/mono0a/graph_tgpr exp/mono0a/decode_test_yesno
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0
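
To check the result from a script rather than by eye, the percentage can be pulled out of the %WER line. This is a minimal sketch; wer_from_line is a hypothetical helper that assumes the second whitespace-separated field is the percentage, as in the sample output above.

```shell
# wer_from_line LINE: print the WER percentage from a "%WER ..." line.
# Prints nothing for lines that do not start with %WER.
# Hypothetical helper, not part of Kaldi.
wer_from_line() {
  printf '%s\n' "$1" | awk '$1 == "%WER" { print $2 }'
}

line='%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0'
wer_from_line "$line"   # prints 0.00
```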

Error 1: Bad FST header

If you see the following error:

ERROR: FstHeader::Read: Bad FST header: standard input

you need to add the OpenFst bin directory to your environment variables.
You can also add it to egs/yesno/s5/path.sh:

export PATH="/Users/xx/kaldi-trunk/tools/openfst-1.7.2/bin:$PATH"

Then run:

source path.sh 
./run.sh 
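
Before rerunning, it can help to confirm that the tools the recipe calls actually resolve on PATH. This is a minimal sketch; missing_cmds is a hypothetical helper, fstcompile stands in for the OpenFst binaries, and the other two names are the commands from the errors discussed in this post.

```shell
# missing_cmds NAME...: print each NAME that cannot be found on PATH.
# Hypothetical helper, not part of Kaldi.
missing_cmds() {
  for name in "$@"; do
    command -v "$name" >/dev/null 2>&1 || echo "$name"
  done
}

# Usage, after sourcing path.sh in egs/yesno/s5:
#   missing_cmds fstcompile arpa2fst gmm-init-mono
```

Empty output means every listed command was found; any name that is printed still needs its directory added to PATH.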

Error 2: gmm-init-mono: command not found

run.pl: job failed, log is in exp/mono0a/log/init.log

# gmm-init-mono --shared-phones=data/lang/phones/sets.int "--train-feats=ark,s,cs:apply-cmvn  --utt2spk=ark:data/train_yesno/split1/1/utt2spk scp:data/train_yesno/split1/1/cmvn.scp scp:data/train_yesno/split1/1/feats.scp ark:- | add-deltas  ark:- ark:- | subset-feats --n=10 ark:- ark:-|" data/lang/topo 39 exp/mono0a/0.mdl exp/mono0a/tree 
# Started at Fri Dec 16 20:27:09 CST 2022
#
bash: line 1: gmm-init-mono: command not found
# Accounting: time=0 threads=1
# Ended (code 127) at Fri Dec 16 20:27:09 CST 2022, elapsed time 0 seconds

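Presumably gmm-init-mono is a command-line tool that the shell cannot locate.
Searching the kaldi folder shows that it lives at src/gmmbin/gmm-init-mono, so add the src/gmmbin directory to your environment variables.
On macOS edit ~/.bash_profile; on Linux edit ~/.bashrc: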

export PATH="/Users/XX/XX/XX/kaldi-trunk/src/gmmbin:$PATH"

Error 3: arpa2fst: command not found

local/prepare_lm.sh: line 13: arpa2fst: command not found

The arpa2fst command is located in xx/kaldi-trunk/src/lmbin; add that directory to your environment variables:

export PATH=$PATH:~/scode/kaldi-trunk/src/lmbin

Then continue with:

source ~/.bash_profile
./run.sh 
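
Errors 2 and 3 have the same cause, so the two fixes can be generalized instead of adding one directory per missing command. This is a sketch under an assumption: add_kaldi_bins is a hypothetical helper, and the src/*bin naming pattern is inferred only from the two directories named in this post (gmmbin and lmbin).

```shell
# add_kaldi_bins KALDI_ROOT: prepend every KALDI_ROOT/src/*bin directory
# (gmmbin, lmbin, ...) to PATH for the current shell.
# Hypothetical helper, not part of Kaldi.
add_kaldi_bins() {
  for dir in "$1"/src/*bin; do
    if [ -d "$dir" ]; then
      PATH="$dir:$PATH"
    fi
  done
  export PATH
}

# Usage (replace the placeholder with your own checkout path):
#   add_kaldi_bins /path/to/kaldi-trunk
```

To make the change permanent, the function and its call can go into ~/.bash_profile or ~/.bashrc, the same files used for the exports above.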

伊织 2022-12-16 (Fri)