Update README.md
README.md
CHANGED
|
@@ -147,18 +147,6 @@ model-index:
|
|
| 147 |
This model transcribes speech in lower case English alphabet along with spaces and apostrophes.
|
| 148 |
It is an "extra-large" version of the Conformer-Transducer model (around 600M parameters).
|
| 149 |
|
| 150 |
-
## NVIDIA Riva: Deployment
|
| 151 |
-
|
| 152 |
-
If you like this and other models from NVIDIA (e.g., CTC-based Conformers), check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models, is currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
|
| 153 |
-
|
| 154 |
-
Additionally, Riva provides:
|
| 155 |
-
|
| 156 |
-
* World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
|
| 157 |
-
* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
|
| 158 |
-
* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
|
| 159 |
-
|
| 160 |
-
Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
|
| 161 |
-
|
| 162 |
## NVIDIA NeMo: Training
|
| 163 |
|
| 164 |
To train, fine-tune, or play with the model, you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend you install it after you've installed the latest PyTorch version.
|
|
@@ -203,6 +191,18 @@ This model accepts 16000 Hz Mono-channel Audio (wav files) as input.
|
|
| 203 |
|
| 204 |
This model provides transcribed speech as a string for a given audio sample.
|
| 205 |
|
| 206 |
## Model Architecture
|
| 207 |
|
| 208 |
The Conformer-Transducer model is an autoregressive variant of the Conformer model [1] for Automatic Speech Recognition that uses Transducer loss/decoding instead of CTC loss. You can find more details on this model here: [Conformer-CTC Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html).
|
|
|
|
| 147 |
This model transcribes speech in lower case English alphabet along with spaces and apostrophes.
|
| 148 |
It is an "extra-large" version of the Conformer-Transducer model (around 600M parameters).
|
| 149 |
|
| 150 |
## NVIDIA NeMo: Training
|
| 151 |
|
| 152 |
To train, fine-tune, or play with the model, you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend you install it after you've installed the latest PyTorch version.
|
|
|
|
| 191 |
|
| 192 |
This model provides transcribed speech as a string for a given audio sample.
|
| 193 |
|
| 194 |
+
## NVIDIA Riva: Deployment
|
| 195 |
+
|
| 196 |
+
If you like this and other models from NVIDIA (e.g., CTC-based Conformers), check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models, is currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
|
| 197 |
+
|
| 198 |
+
Additionally, Riva provides:
|
| 199 |
+
|
| 200 |
+
* World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
|
| 201 |
+
* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
|
| 202 |
+
* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
|
| 203 |
+
|
| 204 |
+
Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
|
| 205 |
+
|
| 206 |
## Model Architecture
|
| 207 |
|
| 208 |
The Conformer-Transducer model is an autoregressive variant of the Conformer model [1] for Automatic Speech Recognition that uses Transducer loss/decoding instead of CTC loss. You can find more details on this model here: [Conformer-CTC Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html).
|
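Reviewer note, not part of the diff: the README text quoted in the second hunk header describes the expected input as mono-channel WAV audio at a 16000 sample rate. The sketch below shows one possible pre-flight check for that input format using only the Python standard library. It assumes the rate means 16,000 Hz (16 kHz). The helper names `is_16khz_mono` and `make_silent_wav` are hypothetical and do not come from NeMo or the README.

```python
import io
import wave


def is_16khz_mono(wav_source) -> bool:
    """Return True if the WAV source has one channel and a 16000 Hz sample rate.

    `wav_source` may be a file path or a binary file-like object.
    """
    with wave.open(wav_source, "rb") as wav:
        return wav.getnchannels() == 1 and wav.getframerate() == 16000


def make_silent_wav(sample_rate: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build an in-memory 16-bit PCM WAV of silence (demonstration data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        frame_count = int(sample_rate * seconds)
        wav.writeframes(b"\x00\x00" * channels * frame_count)
    buffer.seek(0)  # rewind so the buffer can be read back
    return buffer


if __name__ == "__main__":
    print(is_16khz_mono(make_silent_wav(16000, 1)))
    print(is_16khz_mono(make_silent_wav(44100, 2)))
```

A check like this only inspects the WAV header. Files that fail it would still need resampling or downmixing with a separate tool before transcription.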