Open-Source ASR Results


Background

Results

Paraformer (streaming)

[1]

| Model | clean (CER%) | common (CER%) | RTF |
|---|---|---|---|
| Paraformer | 9.73 | 12.96 | 0.0093 |
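The tables in this post report character error rate (CER) and, for the streaming Paraformer, real-time factor (RTF). As a reminder of what these numbers mean, here is a minimal sketch under the usual definitions: CER is the character-level edit distance between hypothesis and reference divided by the number of reference characters, and RTF is processing time divided by audio duration. The function names `edit_distance`, `cer`, and `rtf` are illustrative only and are not taken from any of the toolkits cited here; the scoring scripts behind the reported numbers may also normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: values below 1 mean faster than real time."""
    return processing_seconds / audio_seconds
```

Under this definition, an RTF of 0.0093 corresponds to roughly 0.93 seconds of compute per 100 seconds of audio.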

UniASR

[3]

| Model | clean (CER%) | common (CER%) |
|---|---|---|
| offline | 5.84 | 9.73 |
| normal | 6.11 | 10.42 |
| fast (900ms) | 8.60 | 12.67 |

Paraformer-large

[2]

AISHELL-1

| AISHELL-1 test | w/o LM | w/ LM |
|---|---|---|
| Espnet | 4.90 | 4.70 |
| Wenet | 4.61 | 4.36 |
| K2 | - | 4.26 |
| Blockformer | 4.29 | 4.05 |
| Paraformer-large | 1.95 | 1.68 |
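When comparing rows like these, a relative error reduction is often easier to read than the raw difference. The helper below is a small sketch; `relative_reduction` is a hypothetical name introduced here for illustration, and the inputs are the w/o LM figures for Blockformer and Paraformer-large from the AISHELL-1 table.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Relative CER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_cer - new_cer) / baseline_cer


# Blockformer (4.29) vs. Paraformer-large (1.95), both without a language model.
reduction = relative_reduction(4.29, 1.95)
print(f"relative CER reduction: {reduction:.1f}%")
```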

AISHELL-2

| Model | dev_ios | test_android | test_ios | test_mic |
|---|---|---|---|---|
| Espnet | 5.40 | 6.10 | 5.70 | 6.10 |
| WeNet | - | - | 5.39 | - |
| Paraformer-large | 2.80 | 3.13 | 2.85 | 3.06 |

WenetSpeech test sets

| Model | dev | test_meeting | test_net |
|---|---|---|---|
| Espnet | 9.70 | 15.90 | 8.80 |
| WeNet | 8.60 | 17.34 | 9.26 |
| K2 | 7.76 | 13.41 | 8.71 |
| Paraformer-large | 3.57 | 6.97 | 6.74 |

WeNet-U2pp

[4]

| Decoding mode - Chunk size | Dev (CER%) | Test_Net (CER%) | Test_Meeting (CER%) |
|---|---|---|---|
| ctc greedy search - full | 8.85 | 9.78 | 17.77 |
| ctc greedy search - 16 | 9.32 | 11.02 | 18.79 |
| ctc prefix beam search - full | 8.80 | 9.73 | 17.57 |
| ctc prefix beam search - 16 | 9.25 | 10.96 | 18.62 |
| attention rescoring - full | 8.60 | 9.26 | 17.34 |
| attention rescoring - 16 | 8.87 | 10.22 | 18.11 |

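References

  1. Paraformer Speech Recognition - Chinese - General - 16k - Streaming
  2. Paraformer Speech Recognition - Chinese - General - 16k - Offline - large - pytorch
  3. UniASR Speech Recognition - Chinese - General - 16k - Streaming
  4. WeNet-U2pp_Conformer - Speech Recognition - Chinese - 16k - Streaming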