Commit 8c53520 (parent 4cb792b): Update README.md

README.md CHANGED:
@@ -217,25 +217,25 @@ The performance may vary depending on the prompt. For BLOOMZ models, we recommen
|
|
| 217 |
|
| 218 |
## Model
|
| 219 |
|
| 220 |
-
- Architecture
|
| 221 |
-
- Finetuning steps
|
| 222 |
-
- Finetuning tokens
|
| 223 |
-
- Finetuning layout
|
| 224 |
-
- Precision
|
| 225 |
|
| 226 |
## Hardware
|
| 227 |
|
| 228 |
-
-
|
| 229 |
-
- 8 GPUs per node using NVLink 4 inter-gpu connects, 4 OmniPath links
|
| 230 |
-
- NCCL-communications network
|
| 231 |
-
|
| 232 |
|
| 233 |
## Software
|
| 234 |
|
| 235 |
-
- [Megatron-DeepSpeed](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
|
| 236 |
-
- [DeepSpeed](https://github.com/microsoft/DeepSpeed)
|
| 237 |
-
- [PyTorch](https://github.com/pytorch/pytorch) (pytorch-1.11 w/ CUDA-11.5)
|
| 238 |
-
- [apex](https://github.com/NVIDIA/apex)
|
| 239 |
|
| 240 |
# Evaluation
|
| 241 |
|
|
|
|
| 217 |
|
| 218 |
## Model
|
| 219 |
|
| 220 |
+
- **Architecture:** Same as [bloom](https://huggingface.co/bigscience/bloom), also refer to the `config.json` file
|
| 221 |
+
- **Finetuning steps:** 498
|
| 222 |
+
- **Finetuning tokens:** 2.09 billion
|
| 223 |
+
- **Finetuning layout:** 72x pipeline parallel, 1x tensor parallel, 4x data parallel
|
| 224 |
+
- **Precision:** bfloat16
|
| 225 |
|
| 226 |
## Hardware
|
| 227 |
|
| 228 |
+
- **CPUs:** AMD CPUs with 512GB memory per node
|
| 229 |
+
- **GPUs:** 288 A100 80GB GPUs (36 nodes) with 8 GPUs per node using NVLink 4 inter-gpu connects, 4 OmniPath links
|
| 230 |
+
- **Communication:** NCCL-communications network with a fully dedicated subnet
|
| 231 |
+
|
| 232 |
|
| 233 |
## Software
|
| 234 |
|
| 235 |
+
- **Orchestration:** [Megatron-DeepSpeed](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
|
| 236 |
+
- **Optimizer & parallelism:** [DeepSpeed](https://github.com/microsoft/DeepSpeed)
|
| 237 |
+
- **Neural networks:** [PyTorch](https://github.com/pytorch/pytorch) (pytorch-1.11 w/ CUDA-11.5)
|
| 238 |
+
- **FP16 if applicable:** [apex](https://github.com/NVIDIA/apex)
|
| 239 |
|
| 240 |
# Evaluation
|
| 241 |
|
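The hardware and layout figures added in the new text are internally consistent, which a little arithmetic confirms. The sketch below uses only the numbers stated in the diff (node count, GPUs per node, parallelism degrees, token and step counts); it is an illustrative cross-check, not part of the commit:

```python
# Figures taken from the updated README section above.
nodes, gpus_per_node = 36, 8
pipeline, tensor, data = 72, 1, 4               # finetuning layout
finetuning_tokens, finetuning_steps = 2.09e9, 498

total_gpus = nodes * gpus_per_node
print(total_gpus)                                # 288, matching "288 A100 80GB GPUs"

# The 3D-parallel layout should tile the cluster exactly once:
# pipeline * tensor * data replicas together account for every GPU.
assert total_gpus == pipeline * tensor * data

# Average tokens consumed per finetuning step (~4.2 million).
print(round(finetuning_tokens / finetuning_steps))
```

The first assertion is why the layout line reads 72x1x4: those three degrees multiply out to the 288 GPUs listed under Hardware.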