FunASR Paraformer Offline Model: First Experience
Background
First Experience
Environment Setup
See the FunASR article.
Model Decoding
The model is the Paraformer Chinese general-purpose 16 kHz offline large PyTorch model.
paraformer_large.py
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
inference_pipeline = pipeline(
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch')
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
Result
2023-04-05 17:08:35,714 (asr_inference_pipeline:542) INFO: Computing the result of ASR ...
{'text': '欢迎大家来体验达摩院推出的语音识别模型'}
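The pipeline should also accept a local audio file in place of the demo URL. A minimal sketch (the wav path is a placeholder for your own 16 kHz mono recording; the 'text' key matches the result shown above):
rec_result = inference_pipeline(audio_in='/path/to/your_16k.wav')
print(rec_result['text'])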
Trying the Medium Model
When converting the large model above to ONNX, my machine ran out of memory, so I switched to the medium model to continue.
Model Decoding
The model is the Paraformer Chinese general-purpose 16 kHz offline model.
paraformer.py
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
inference_pipeline = pipeline(
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1')
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
Result
2023-04-05 17:27:50,618 (asr_inference_pipeline:542) INFO: Computing the result of ASR ...
{'text': '欢迎大家来体验达摩院推出的语音识别模型'}
Converting the Model to ONNX Format
pip install onnx onnxruntime
python -m funasr.export.export_model --model-name damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1 --export-dir ./export --type onnx --quantize True
Result
output dir: export/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1
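As a quick sanity check (my own addition, not part of the FunASR tooling), the exported graph can be loaded with onnxruntime to confirm it is well formed and to list its inputs and outputs:
import onnxruntime as ort

# Path produced by the export step above; model_quant.onnx sits next to it
# because --quantize True was passed.
onnx_path = './export/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/model.onnx'
sess = ort.InferenceSession(onnx_path, providers=['CPUExecutionProvider'])
print([i.name for i in sess.get_inputs()])
print([o.name for o in sess.get_outputs()])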
ONNX Decoding
Install the onnxruntime library
wget https://github.com/microsoft/onnxruntime/releases/download/v1.14.0/onnxruntime-linux-x64-1.14.0.tgz
tar -zxvf onnxruntime-linux-x64-1.14.0.tgz
At this point the directory layout looks like this:
$ ls
onnxruntime-linux-x64-1.14.0 speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1
The model directory itself contains:
find speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/finetune.yaml
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/config.yaml
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/decoding.yaml
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/README.md
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/example
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/example/asr_example.wav
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/configuration.json
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/am.mvn
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/model.pb
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/.msc
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/seg_dict
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/fig
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/fig/struct.png
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/fig/error_type.png
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/model.onnx
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/model_quant.onnx
speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/.mdl
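With the exported files in place, the model can also be decoded directly from Python through the funasr_onnx runtime (installed further below with pip install funasr_onnx). A minimal sketch, assuming funasr_onnx exposes a Paraformer class that takes the model directory; check the FunASR runtime docs for the exact arguments:
from funasr_onnx import Paraformer

# Exported model directory listed above; quantize=True should pick up model_quant.onnx.
model_dir = './speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1'
model = Paraformer(model_dir, batch_size=1, quantize=True)
# Decode the example wav shipped inside the model directory.
result = model([model_dir + '/example/asr_example.wav'])
print(result)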
Building grpc v1.52.0 from Source
For details, see this site's article 《grpc cmake编译初体验》 (a first look at building grpc with cmake).
In short:
Install cmake > 3.11
wget https://github.com/Kitware/CMake/releases/download/v3.26.3/cmake-3.26.3-linux-x86_64.sh
sh cmake-3.26.3-linux-x86_64.sh
Environment Setup
sudo apt install openssl libssl-dev pkg-config
Download the Source
git clone -b v1.52.0 --depth=1 https://github.com/grpc/grpc.git
cd grpc
git submodule update --init --recursive
Build
mkdir -p cmake/build
cd cmake/build
cmake -DCMAKE_BUILD_TYPE=Release -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF -DgRPC_ZLIB_PROVIDER=package -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_SSL_PROVIDER=package ../..
make -j4
sudo make install
Confirm the Build Succeeded
cd ~/build/grpc/examples/cpp/helloworld
mkdir build
cd build
cmake ..
make
$ ./greeter_server
Server listening on 0.0.0.0:50051
$ ./greeter_client
Greeter received: Hello world
Build and Start the ONNX Server
Dependencies
sudo apt-get install libfftw3-dev
Install yaml-cpp 0.6.0 from source; the yaml-cpp package in the distro repositories has issues, and building it yourself is quick anyway.
wget https://github.com/jbeder/yaml-cpp/archive/refs/tags/yaml-cpp-0.6.0.zip
unzip yaml-cpp-0.6.0.zip
cd yaml-cpp-yaml-cpp-0.6.0
mkdir build
cd build
cmake [-G generator] [-DYAML_BUILD_SHARED_LIBS=ON|OFF] ..
make
sudo make install
Go into the funasr/runtime/grpc directory and run the build following rebuild.sh as a template, remembering to change the model path in it to your own:
rm -rf cmake
mkdir -p cmake/build
cd cmake/build
cmake -DCMAKE_BUILD_TYPE=release ../.. -DONNXRUNTIME_DIR=~/workspace/onnx/export/damo/onnxruntime-linux-x64-1.14.0
make
Result
paraformer_server
Start the grpc paraformer server
./cmake/build/paraformer_server port thread_num /path/to/model_file quantize(true or false)
./cmake/build/paraformer_server 10108 4 /data/asrmodel/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch false
For example:
./paraformer_server 10108 4 ~/workspace/onnx/export/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/ false
Result
ASRServicer init
Server listening on 0.0.0.0:10108
Testing with the grpc Python Client
Install dependencies
python3 -m pip install pyaudio webrtcvad grpcio grpcio-tools -i https://mirror.sjtu.edu.cn/pypi/web/simple
cd ../python/grpc
python3 grpc_main_client_mic.py --host 192.168.33.10 --port 10108
Result
* recording
{'success': True, 'detail': 'finish_sentence', 'server_delay_ms': 248, 'text': '一铪巂婚窣楮掊'}
The recognition quality of this model is worrying.
Using the Python Server Directly
Dependency packages
Install funasr
pip install "modelscope[audio_asr]" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip install --editable ./
Install dependencies
cd funasr/runtime/python/grpc
pip install -r requirements_server.txt
pip install funasr_onnx
Run
cd funasr/runtime/python/grpc
python grpc_main_server.py --port 10095 --backend onnxruntime --onnx_dir ~/workspace/onnx/export/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1
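To test this server, the earlier grpc_main_client_mic.py client can be reused; point --port at 10095 instead of 10108, and --host at wherever the Python server is running.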