Open-Source ASR Results
Background
Results
Paraformer (real-time)
[1]
| model | clean (CER%) | common (CER%) | RTF |
|---|---|---|---|
| Paraformer | 9.73 | 12.96 | 0.0093 |
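For reference when reading the CER% and RTF columns, here is a minimal sketch of how these two metrics are commonly computed. It assumes character-level Levenshtein distance for CER and wall-clock decoding time divided by audio duration for RTF. The function names are hypothetical, and the cited results may apply text normalization that is not shown here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer_percent(refs, hyps):
    """Corpus-level CER in percent: total edits / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: decoding time divided by audio duration."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical example: one substituted character out of six.
    print(round(cer_percent(["今天天气很好"], ["今天天汽很好"]), 2))
    # Hypothetical timing: 0.93 s of compute for 100 s of audio.
    print(rtf(0.93, 100.0))
```

An RTF below 1 means decoding runs faster than real time. Lower is better for both metrics.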
UniASR
[3]
| model | clean (CER%) | common (CER%) |
|---|---|---|
| offline | 5.84 | 9.73 |
| normal | 6.11 | 10.42 |
| fast (900 ms) | 8.60 | 12.67 |
Paraformer-large
[2]
AISHELL-1
| AISHELL-1 test | w/o LM (CER%) | w/ LM (CER%) |
|---|---|---|
| ESPnet | 4.90 | 4.70 |
| WeNet | 4.61 | 4.36 |
| K2 | - | 4.26 |
| Blockformer | 4.29 | 4.05 |
| Paraformer-large | 1.95 | 1.68 |
AISHELL-2
| model | dev_ios | test_android | test_ios | test_mic |
|---|---|---|---|---|
| ESPnet | 5.40 | 6.10 | 5.70 | 6.10 |
| WeNet | - | - | 5.39 | - |
| Paraformer-large | 2.80 | 3.13 | 2.85 | 3.06 |
WenetSpeech test sets
| model | dev | test_meeting | test_net |
|---|---|---|---|
| ESPnet | 9.70 | 15.90 | 8.80 |
| WeNet | 8.60 | 17.34 | 9.26 |
| K2 | 7.76 | 13.41 | 8.71 |
| Paraformer-large | 3.57 | 6.97 | 6.74 |
WeNet-U2pp
[4]
| Decoding mode - Chunk size | Dev (CER%) | Test_Net (CER%) | Test_Meeting (CER%) |
|---|---|---|---|
| ctc greedy search - full | 8.85 | 9.78 | 17.77 |
| ctc greedy search - 16 | 9.32 | 11.02 | 18.79 |
| ctc prefix beam search - full | 8.80 | 9.73 | 17.57 |
| ctc prefix beam search - 16 | 9.25 | 10.96 | 18.62 |
| attention rescoring - full | 8.60 | 9.26 | 17.34 |
| attention rescoring - 16 | 8.87 | 10.22 | 18.11 |