See axolotl config

axolotl version: 0.12.2

# Base model configuration
base_model: Qwen/Qwen3-4B-Instruct-2507
load_in_8bit: false
load_in_4bit: false  # 4-bit loading is only needed for QLoRA

# LoRA adapter configuration (the key part of this config)
adapter: lora  # explicitly select LoRA
lora_model_dir:  # optionally point to pre-trained LoRA weights here

# LoRA hyperparameters
lora_r: 64
lora_alpha: 128
lora_dropout: 0.1
lora_target_modules:  # key modules of the Qwen3 model
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
lora_target_linear: true  # automatically target all linear layers
lora_fan_in_fan_out: false

# Dataset settings
chat_template: qwen3
datasets:
  - path: /workspace/train_dir_0927-02/goal_data.json
    type: chat_template
    roles_to_train: ["assistant"]
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

dataset_prepared_path:
val_set_size: 0.1
output_dir: /workspace/train_dir_0927-02/checkpoints

# Sequence length settings
sequence_len: 7000
pad_to_sequence_len: true
sample_packing: true
eval_sample_packing: false

# Training hyperparameters
num_epochs: 5
micro_batch_size: 8  # the H100 has ample memory
gradient_accumulation_steps: 2  # LoRA on 8 GPUs does not need much accumulation
eval_batch_size: 8

# Optimizer settings
optimizer: adamw_torch_fused
lr_scheduler: cosine
learning_rate: 8e-5
warmup_ratio: 0.1
weight_decay: 0.01

# Precision settings
bf16: true  # the H100 supports bf16
tf32: true
gradient_checkpointing: true  # saves GPU memory
flash_attention: true

# Logging and checkpointing
logging_steps: 30
evals_per_epoch: 1
saves_per_epoch: 1
save_total_limit: 3  # keep only the 3 most recent checkpoints

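# Multi-GPU training configuration: DeepSpeed is used instead of FSDP
deepspeed: /workspace/axolotl/deepspeed_configs/zero2.json  # or inline the configuration directly
# To inline the DeepSpeed configuration instead:
# deepspeed:
#   zero_optimization:
#     stage: 2  # ZeRO-2 works well with LoRA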
#     allgather_partitions: true
#     allgather_bucket_size: 2e8
#     reduce_scatter: true
#     reduce_bucket_size: 2e8
#     overlap_comm: true
#     contiguous_gradients: true
#   bf16:
#     enabled: true
#   gradient_accumulation_steps: 2
#   gradient_clipping: 1.0
#   train_batch_size: auto
#   train_micro_batch_size_per_gpu: auto

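# FSDP configuration removed, since LoRA does not need FSDP
# (all fsdp-related settings deleted)

# Other optimizations
ddp_timeout: 3600  # DDP timeout (seconds)
ddp_find_unused_parameters: false  # usually not needed for LoRA

# Optional: to train without DeepSpeed, use native DDP
# Just remove the deepspeed setting and Axolotl will use DDP automatically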
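
As a rough illustration of what the LoRA settings in the config imply, the sketch below computes the usual LoRA scaling factor (alpha divided by rank) for lora_alpha: 128 and lora_r: 64, and the number of trainable parameters LoRA adds to a single linear layer. It assumes the standard (non rank-stabilized) LoRA scaling; the 1024 x 1024 layer shape is hypothetical and only illustrates the formula, it is not read from the Qwen3 model.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    Matrix A has shape (r, in_features) and matrix B has shape (out_features, r).
    """
    return r * (in_features + out_features)


# Values taken from the config: lora_alpha = 128, lora_r = 64.
scaling = lora_scaling(alpha=128, r=64)

# Hypothetical square projection, for illustration only (not a real Qwen3 shape).
example_params = lora_params(in_features=1024, out_features=1024, r=64)

print(f"scaling factor: {scaling}")
print(f"LoRA parameters for a 1024x1024 layer: {example_params}")
```

To estimate the total adapter size, one would sum lora_params over the real shapes of every targeted module in the base model's configuration.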

workspace/train_dir_0927-02/checkpoints

This model is a fine-tuned version of Qwen/Qwen3-4B-Instruct-2507 on the /workspace/train_dir_0927-02/goal_data.json dataset. It achieves the following results on the evaluation set:

Loss: 0.0472
Memory/max Mem Active (GiB): 114.6
Memory/max Mem Allocated (GiB): 114.6
Memory/device Mem Reserved (GiB): 115.83

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 2
total_train_batch_size: 128
total_eval_batch_size: 64
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 20
training_steps: 200
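
The derived values in this list follow from the per-device settings. The sketch below reproduces that arithmetic from the numbers reported above and in the config (micro batch 8, 8 devices, 2 accumulation steps, warmup_ratio 0.1, 5 epochs, sequence_len 7000). The variable names are illustrative, and the tokens-per-step figure is only an upper bound that assumes every packed sequence is filled to sequence_len.

```python
# Values reported in the hyperparameter list and the axolotl config.
micro_batch_size = 8
eval_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 2
training_steps = 200
num_epochs = 5
warmup_ratio = 0.1
sequence_len = 7000

# Sequences consumed per optimizer step across all GPUs.
total_train_batch_size = micro_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

# Warmup steps implied by the warmup ratio.
warmup_steps = round(warmup_ratio * training_steps)

# Optimizer steps per epoch.
steps_per_epoch = training_steps // num_epochs

# Upper bound on tokens per optimizer step, assuming fully packed sequences.
max_tokens_per_step = total_train_batch_size * sequence_len

print(total_train_batch_size, total_eval_batch_size, warmup_steps, steps_per_epoch)
print(max_tokens_per_step)
```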

Training results

Training Loss	Epoch	Step	Validation Loss	Mem Active (GiB)	Mem Allocated (GiB)	Mem Reserved (GiB)
No log	0	0	1.1065	88.14	88.14	88.77
0.3209	1.0	40	0.0662	114.6	114.6	115.83
0.0632	2.0	80	0.0543	114.6	114.6	115.83
0.0456	3.0	120	0.0501	114.6	114.6	115.83
0.0416	4.0	160	0.0479	114.6	114.6	115.83
0.0381	5.0	200	0.0472	114.6	114.6	115.83
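
One way to read the table is to look at how much each epoch still improves the validation loss. The sketch below copies the per-epoch rows from the table and computes the epoch-to-epoch change, the final gap between validation and training loss, and the perplexity implied by the final validation loss. It assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention but is not stated in the card.

```python
import math

# (epoch, training loss, validation loss) copied from the results table.
# The step-0 row has no training loss and is left out.
rows = [
    (1, 0.3209, 0.0662),
    (2, 0.0632, 0.0543),
    (3, 0.0456, 0.0501),
    (4, 0.0416, 0.0479),
    (5, 0.0381, 0.0472),
]

val_losses = [val for _, _, val in rows]

# Improvement in validation loss from each epoch to the next.
val_deltas = [prev - cur for prev, cur in zip(val_losses, val_losses[1:])]

# True if the validation loss falls at every epoch.
is_decreasing = all(delta > 0 for delta in val_deltas)

# Gap between validation and training loss at the last epoch.
final_gap = rows[-1][2] - rows[-1][1]

# Perplexity implied by the final validation loss (assumes loss in nats).
final_perplexity = math.exp(val_losses[-1])

print([round(delta, 4) for delta in val_deltas])
print(is_decreasing, round(final_gap, 4), round(final_perplexity, 4))
```

The shrinking deltas show diminishing returns: the last epoch lowers the validation loss by only 0.0007.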

Framework versions

PEFT 0.17.0
Transformers 4.55.2
Pytorch 2.6.0+cu126
Datasets 4.0.0
Tokenizers 0.21.4
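
Since this model is a LoRA adapter trained with PEFT on top of Qwen/Qwen3-4B-Instruct-2507, a minimal loading sketch with the libraries listed above might look like the following. It assumes the adapter weights are published in the cjkasbdkjnlakb/agent-0927-02 repository and that the repository is accessible; the load_adapter helper is hypothetical and not part of the original card. The imports sit inside the function so that defining it does not require the libraries or download any weights.

```python
def load_adapter(
    base_id: str = "Qwen/Qwen3-4B-Instruct-2507",
    adapter_id: str = "cjkasbdkjnlakb/agent-0927-02",
):
    """Load the base model and attach the LoRA adapter (sketch; downloads weights)."""
    # Imported lazily so this function can be defined without the libraries installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # bf16 matches the precision used for training in the config.
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Wrap the base model with the adapter weights from the adapter repository.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter()` returns objects that can be used with the tokenizer's chat template and `model.generate` in the usual way.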


Model tree for cjkasbdkjnlakb/agent-0927-02

Base model: Qwen/Qwen3-4B-Instruct-2507
Adapter: this model
