Update README.md
README.md
CHANGED
|
@@ -51,7 +51,65 @@ python infer.py
|
|
| 51 |
- 14k audio hours
|
| 52 |
- English only
|
| 53 |
|
| 54 |
-
Dataset is
|
| 55 |
|
| 56 |
### Citation
|
| 57 |
|
|
@@ -67,6 +125,18 @@ Dataset is open available in [HF Dataset](https://huggingface.co/datasets/nguyen
|
|
| 67 |
keywords={Training;Adaptation models;Limiting;Predictive models;Data models;Robustness;Multilingual;Data mining;Speech processing;Standards;speaker-attributed;asr;multilingual},
|
| 68 |
doi={10.1109/ICASSP49660.2025.10889116}}
|
| 69 |
|
| 70 |
```
|
| 71 |
|
| 72 |
### License
|
|
|
|
| 51 |
- 14k audio hours
|
| 52 |
- English only
|
| 53 |
|
| 54 |
+
Dataset is openly available in [HF Dataset](https://huggingface.co/datasets/nguyenvulebinh/spk-attribute)
|
| 55 |
+
|
| 56 |
+
*Example*
|
| 57 |
+
|
| 58 |
+
Audio
|
| 59 |
+
|
| 60 |
+
<audio controls>
|
| 61 |
+
<source src="https://huggingface.co/nguyenvulebinh/MSA-ASR/resolve/main/sample_augment.wav" type="audio/wav">
|
| 62 |
+
Your browser does not support the audio element.
|
| 63 |
+
</audio>
|
| 64 |
+
|
| 65 |
+
|
| 66 |
+
Label:
|
| 67 |
+
|
| 68 |
+
```code
|
| 69 |
+
spk_1 A 0.00 1.58 »spk_1
|
| 70 |
+
spk_1 A 0.00 1.58 Pacifica
|
| 71 |
+
spk_1 A 1.58 0.68 continues
|
| 72 |
+
spk_1 A 2.27 0.52 today
|
| 73 |
+
spk_1 A 2.79 0.24 to
|
| 74 |
+
spk_1 A 3.03 0.20 be
|
| 75 |
+
spk_1 A 3.23 0.14 a
|
| 76 |
+
spk_1 A 3.37 0.54 listener
|
| 77 |
+
spk_1 A 3.91 0.80 supported
|
| 78 |
+
spk_1 A 4.71 0.70 network
|
| 79 |
+
spk_1 A 5.42 0.38 of
|
| 80 |
+
spk_2 A 5.80 0.12 »spk_2
|
| 81 |
+
spk_2 A 5.80 0.12 At
|
| 82 |
+
spk_2 A 5.92 0.42 home,
|
| 83 |
+
spk_2 A 6.34 0.18 an
|
| 84 |
+
spk_2 A 6.52 0.38 Aed
|
| 85 |
+
spk_2 A 6.90 0.26 is
|
| 86 |
+
spk_2 A 7.16 0.18 an
|
| 87 |
+
spk_2 A 7.34 0.56 automated
|
| 88 |
+
spk_2 A 7.90 0.60 external
|
| 89 |
+
spk_2 A 8.50 0.90 defibrillator.
|
| 90 |
+
spk_2 A 9.40 0.40 It's
|
| 91 |
+
spk_2 A 9.81 0.08 the
|
| 92 |
+
spk_2 A 9.89 0.36 device
|
| 93 |
+
spk_2 A 10.25 0.08 you
|
| 94 |
+
spk_2 A 10.33 0.16 use
|
| 95 |
+
spk_2 A 10.49 0.12 when
|
| 96 |
+
spk_2 A 10.61 0.10 your
|
| 97 |
+
spk_2 A 10.73 0.16 heart
|
| 98 |
+
spk_2 A 10.89 0.18 goes
|
| 99 |
+
spk_2 A 11.07 0.12 into
|
| 100 |
+
spk_2 A 11.19 0.38 cardiac
|
| 101 |
+
spk_2 A 11.57 0.38 arrest
|
| 102 |
+
spk_2 A 11.95 0.18 to
|
| 103 |
+
spk_2 A 12.13 0.36 shock
|
| 104 |
+
spk_2 A 12.49 0.14 it
|
| 105 |
+
spk_2 A 12.63 0.28 back
|
| 106 |
+
spk_2 A 12.91 0.22 into
|
| 107 |
+
spk_2 A 13.13 0.06 a
|
| 108 |
+
spk_2 A 13.19 0.32 normal
|
| 109 |
+
spk_2 A 13.51 0.88 rhythm.
|
| 110 |
+
spk_1 A 14.40 1.38 »spk_1
|
| 111 |
+
spk_1 A 14.40 1.38 stations.
|
| 112 |
+
```
|
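The label block added in this change lists one token per line. A minimal sketch of reading it is given here, assuming (the README does not state this) that the five whitespace-separated columns are speaker, channel, start time in seconds, duration in seconds, and token, and that tokens beginning with `»` mark the start of a speaker turn. The names `Word`, `parse_label`, and `group_turns` are hypothetical helpers, not part of the MSA-ASR repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    speaker: str
    channel: str
    start: float      # assumed: start time in seconds
    duration: float   # assumed: duration in seconds
    token: str


def parse_label(text: str) -> List[Word]:
    """Parse label lines of the assumed form: speaker channel start duration token."""
    words = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        spk, ch, start, dur, tok = parts
        words.append(Word(spk, ch, float(start), float(dur), tok))
    return words


def group_turns(words: List[Word]) -> List[Tuple[str, str]]:
    """Merge consecutive words of the same speaker into (speaker, text) turns."""
    turns: List[Tuple[str, List[str]]] = []
    for w in words:
        if w.token.startswith("»"):
            continue  # assumed speaker-turn marker, not a spoken word
        if turns and turns[-1][0] == w.speaker:
            turns[-1][1].append(w.token)
        else:
            turns.append((w.speaker, [w.token]))
    return [(spk, " ".join(toks)) for spk, toks in turns]


# A few lines copied from the example label above.
sample = """\
spk_1 A 5.42 0.38 of
spk_2 A 5.80 0.12 »spk_2
spk_2 A 5.80 0.12 At
spk_2 A 5.92 0.42 home,
spk_1 A 14.40 1.38 »spk_1
spk_1 A 14.40 1.38 stations.
"""

turns = group_turns(parse_label(sample))
for speaker, text in turns:
    print(f"{speaker}: {text}")
```

Grouping by consecutive speaker rather than by the `»` markers alone keeps the sketch usable even if a marker line is missing; whether the markers are always present is not confirmed by the README.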
| 113 |
|
| 114 |
### Citation
|
| 115 |
|
|
|
|
| 125 |
keywords={Training;Adaptation models;Limiting;Predictive models;Data models;Robustness;Multilingual;Data mining;Speech processing;Standards;speaker-attributed;asr;multilingual},
|
| 126 |
doi={10.1109/ICASSP49660.2025.10889116}}
|
| 127 |
|
| 128 |
+
@INPROCEEDINGS{10446589,
|
| 129 |
+
author={Nguyen, Thai-Binh and Waibel, Alexander},
|
| 130 |
+
booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
|
| 131 |
+
title={Synthetic Conversations Improve Multi-Talker ASR},
|
| 132 |
+
year={2024},
|
| 133 |
+
volume={},
|
| 134 |
+
number={},
|
| 135 |
+
pages={10461-10465},
|
| 136 |
+
keywords={Systematics;Error analysis;Knowledge based systems;Oral communication;Signal processing;Data models;Acoustics;multi-talker;asr;synthetic conversation},
|
| 137 |
+
doi={10.1109/ICASSP48485.2024.10446589}}
|
| 138 |
+
|
| 139 |
+
|
| 140 |
```
|
| 141 |
|
| 142 |
### License
|