evilfreelancer/ruGPT-3.5-13B-lora

@@ -23,15 +23,31 @@ tags:
   - adapter
 ---
-# ruGPT-3.5 13B LoRA
-This is an adapter-only version, based on [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B).
-Training code is [here](https://github.com/EvilFreelancer/ruGPT-3.5-13B-lora)
-> You may use ruGPT-3.5 13B fp16 base model instead.
-## Training procedure
 The following `bitsandbytes` quantization config was used during training:
@@ -46,7 +62,9 @@ The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_use_double_quant: False
 - bnb_4bit_compute_dtype: float32
-### Framework versions
 - PyTorch 2.1.0
 - PEFT 0.5.0

   - adapter
 ---
+# ruGPT-3.5 13B LoRA: Adapter-Only Version
+This is the adapter-only version of ruGPT-3.5 13B LoRA, built on [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B).
+📌 Important: This model was trained using settings identical to [GigaSaiga](https://huggingface.co/IlyaGusev/gigasaiga_lora), but incorporates two additional datasets.
+🔗 Training code is [here](https://github.com/EvilFreelancer/ruGPT-3.5-13B-lora).
+> Note: You can use the ruGPT-3.5 13B fp16 base model instead.
+## 📚 Training Datasets
+The training datasets are the same as those used for [Saiga-2](https://github.com/IlyaGusev/rulm).
+The full list:
+- [ru_turbo_alpaca](https://huggingface.co/datasets/IlyaGusev/ru_turbo_alpaca)
+- [ru_turbo_alpaca_evol_instruct](https://huggingface.co/datasets/IlyaGusev/ru_turbo_alpaca_evol_instruct)
+- [ru_turbo_saiga](https://huggingface.co/datasets/IlyaGusev/ru_turbo_saiga)
+- [ru_sharegpt_cleaned](https://huggingface.co/datasets/IlyaGusev/ru_sharegpt_cleaned)
+- [oasst1_ru_main_branch](https://huggingface.co/datasets/IlyaGusev/oasst1_ru_main_branch)
+- [gpt_roleplay_realm](https://huggingface.co/datasets/IlyaGusev/gpt_roleplay_realm)
+- [ru_instruct_gpt4](https://huggingface.co/datasets/lksy/ru_instruct_gpt4)
+## 🛠 Training Procedure
 The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_use_double_quant: False
 - bnb_4bit_compute_dtype: float32
+## ⚙️ Framework Versions
+Ensure you have the following framework versions for compatibility:
 - PyTorch 2.1.0
 - PEFT 0.5.0
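
Since this repository ships only the LoRA adapter, the base model has to be loaded first and the adapter attached on top. The following is a minimal sketch of one way to do that with PEFT and `transformers`, not the author's own loading code. The function name `load_adapter_model` is hypothetical, and the fp16 dtype, `device_map="auto"` (which needs the `accelerate` package), and loading the tokenizer from the base repository are all assumptions.

```python
ADAPTER_ID = "evilfreelancer/ruGPT-3.5-13B-lora"  # this repository


def load_adapter_model(adapter_id=ADAPTER_ID):
    """Load the base model named in the adapter config, then attach the LoRA weights."""
    # Imported lazily so the heavy dependencies are only needed when the function runs.
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The adapter config records which base model it was trained against.
    config = PeftConfig.from_pretrained(adapter_id)
    base_id = config.base_model_name_or_path

    # Assumption: the tokenizer is taken from the base model repository.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,  # assumption: fp16 weights
        device_map="auto",          # assumption: requires `accelerate`
    )

    # Wrap the base model with the adapter weights and switch to inference mode.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_adapter_model()` downloads the 13B base model, so it needs substantial disk space and memory.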

generation_config.json ADDED

	@@ -0,0 +1,15 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 3,
+  "pad_token_id": 0,
+  "transformers_version": "4.34.0",
+  "temperature": 0.2,
+  "top_p": 0.9,
+  "top_k": 30,
+  "do_sample": true,
+  "max_new_tokens": 1536,
+  "num_beams": 1,
+  "repetition_penalty": 1.15,
+  "no_repeat_ngram_size": 15
+}
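
To show what the `temperature`, `top_k`, and `top_p` values in this config do to a next-token distribution, here is a minimal pure-Python sketch of the three filters applied in sequence. It is an illustration only, not the `transformers` implementation. The function name `filtered_distribution` and the five-token toy logits are hypothetical, and `repetition_penalty` and `no_repeat_ngram_size` are not modelled.

```python
import math

# Sampling values copied from generation_config.json.
GENERATION_CONFIG = {"temperature": 0.2, "top_k": 30, "top_p": 0.9}


def filtered_distribution(logits, temperature, top_k, top_p):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring token ids, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, p in zip(ranked, probs):
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {token_id: p / norm for token_id, p in kept}


toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0]  # hypothetical 5-token vocabulary
dist = filtered_distribution(toy_logits, **GENERATION_CONFIG)
print(dist)
```

On these toy logits, a temperature of 0.2 concentrates so much probability on the top token that the `top_p` cut keeps only that token, so sampling behaves almost greedily. At a temperature of 1.0 with the same `top_p`, four of the five tokens survive.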