Date: 2025-11-22
Type: feature development, performance optimization, architecture refactoring
Status: completed
Configuration:

- Custom model import, e.g. `from core.lightweight_dqn import LightweightDQN as custom_model`
- `resume_model`: path of the model checkpoint to resume training from, e.g. `./checkpoints/model.pth`
- `samples_per_epoch`: number of samples used per epoch
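A minimal sketch of how the `resume_model` setting could be consumed when training starts. The helper name `maybe_resume` is illustrative only, not the project's actual API:

```python
# Hedged sketch: load a checkpoint if `resume_model` points at an existing file.
# `maybe_resume` is a hypothetical helper name, not the project's real function.
import os
from typing import Optional

import torch

def maybe_resume(model: torch.nn.Module, resume_model: Optional[str]) -> bool:
    """Load weights from `resume_model` if the path is set and exists."""
    if resume_model and os.path.exists(resume_model):
        state = torch.load(resume_model, map_location="cpu")
        model.load_state_dict(state)
        return True
    return False
```

When `resume_model` is unset or the file is missing, training simply starts from scratch.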
Trainer:

- Added `self.history = {'epoch': [], 'loss': []}` to record training history
- Outputs:
  - `best_model.pth`: best model checkpoint
  - `training_history_<timestamp>.png`: loss curve (log scale)
  - `training_history_<timestamp>.txt`: training data as plain text
- Added a `plot_training_history()` method
- Sampling strategy:
  - `samples_per_epoch >= buffer_size`: use all data; the DataLoader is built once
  - `samples_per_epoch < buffer_size`: re-sample every epoch
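A minimal sketch of what the history-plotting step could look like, assuming matplotlib; the actual implementation lives in `core/trainer.py` and may differ in detail:

```python
# Illustrative sketch of a plot_training_history-style helper (assumption:
# matplotlib is available; file naming mirrors the log's description).
import os
import time

import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display
import matplotlib.pyplot as plt

def plot_training_history(history: dict, save_dir: str) -> str:
    """Save a log-scale loss curve (.png) and the raw numbers (.txt)."""
    ts = time.strftime("%Y%m%d_%H%M%S")
    png_path = os.path.join(save_dir, f"training_history_{ts}.png")
    txt_path = os.path.join(save_dir, f"training_history_{ts}.txt")

    fig, ax = plt.subplots()
    ax.plot(history["epoch"], history["loss"])
    ax.set_yscale("log")  # the log entry specifies a log-scale loss curve
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    fig.savefig(png_path)
    plt.close(fig)

    with open(txt_path, "w") as f:
        for e, l in zip(history["epoch"], history["loss"]):
            f.write(f"{e}\t{l}\n")
    return png_path
```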
TrainingBuffer:

- `sample_indices(sample_size) -> List[int]`: core random-sampling logic
- `get_dataloader(batch_size, shuffle, device, sample_size) -> DataLoader`: returns a PyTorch DataLoader
- `BufferDataset(Dataset)`: PyTorch Dataset wrapper; `__getitem__` converts the data to tensors
- Removed `sample(batch_size)`: the old sampling method, superseded by the DataLoader

Trainer members:

- `self.history`: training history record
- `self.best_loss`: best loss value so far
- `self.samples_per_epoch`: number of samples used per epoch
- `self.use_sampling`: whether to re-sample every epoch

Refactored methods: `__init__()`, `train()`, `_compute_loss()`, and `plot_training_history(save_dir)` (plots the training curves and saves the data).

Data flow:

Buffer.samples (numpy arrays in CPU memory)
  -> sample_indices() picks random indices
  -> BufferDataset.__getitem__() converts to tensors and moves them to the GPU
  -> DataLoader handles batching and shuffling
  -> Trainer.train() consumes the tensors directly on the GPU
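The buffer-side flow above could be sketched roughly as follows. This is a minimal illustration on a CPU device; sample field layouts and class internals are assumptions, not the code in `data/training_buffer.py`:

```python
# Sketch of the described buffer API: sample_indices -> BufferDataset -> DataLoader.
# Internals are illustrative assumptions.
import random
from typing import List, Optional

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class BufferDataset(Dataset):
    """Wraps the buffer's numpy samples; converts to tensors in __getitem__."""
    def __init__(self, samples: np.ndarray, indices: List[int], device: str = "cpu"):
        self.samples = samples
        self.indices = indices
        self.device = device

    def __len__(self) -> int:
        return len(self.indices)

    def __getitem__(self, i: int) -> torch.Tensor:
        # Convert on access and place on the target device.
        return torch.as_tensor(self.samples[self.indices[i]], device=self.device)

class TrainingBuffer:
    def __init__(self, samples: np.ndarray):
        self.samples = samples  # numpy arrays kept in CPU memory

    def sample_indices(self, sample_size: int) -> List[int]:
        # Core random-sampling logic: without replacement when possible.
        n = len(self.samples)
        if sample_size >= n:
            return list(range(n))
        return random.sample(range(n), sample_size)

    def get_dataloader(self, batch_size: int, shuffle: bool = True,
                       device: str = "cpu",
                       sample_size: Optional[int] = None) -> DataLoader:
        idx = self.sample_indices(sample_size or len(self.samples))
        return DataLoader(BufferDataset(self.samples, idx, device),
                          batch_size=batch_size, shuffle=shuffle)
```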
```python
# Decide whether to re-sample each epoch
if samples_per_epoch >= buffer_size:
    # Use all data; build the DataLoader once
    dataloader = buffer.get_dataloader(...)
    for epoch in range(num_epochs):
        for batch in dataloader:
            train_step(batch)
else:
    # Re-sample every epoch
    for epoch in range(num_epochs):
        dataloader = buffer.get_dataloader(..., sample_size=samples_per_epoch)
        for batch in dataloader:
            train_step(batch)
```
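The two branches above can be exercised end-to-end with a self-contained toy stand-in. `ToyBuffer` and `run` are illustrative names only, not the project's real objects:

```python
# Toy demonstration of the build-once vs. re-sample-per-epoch branching.
# All names here are stand-ins for the real buffer/trainer.
import random

class ToyBuffer:
    def __init__(self, n: int):
        self.data = list(range(n))

    def get_dataloader(self, batch_size, sample_size=None):
        idx = (random.sample(range(len(self.data)), sample_size)
               if sample_size and sample_size < len(self.data)
               else list(range(len(self.data))))
        # Yield batches of the sampled items.
        return [[self.data[j] for j in idx[i:i + batch_size]]
                for i in range(0, len(idx), batch_size)]

def run(buffer, num_epochs, samples_per_epoch, batch_size=2):
    seen = 0
    if samples_per_epoch >= len(buffer.data):
        dataloader = buffer.get_dataloader(batch_size)  # built once
        for _ in range(num_epochs):
            for batch in dataloader:
                seen += len(batch)
    else:
        for _ in range(num_epochs):  # re-sample every epoch
            dataloader = buffer.get_dataloader(batch_size,
                                               sample_size=samples_per_epoch)
            for batch in dataloader:
                seen += len(batch)
    return seen
```

With a 10-item buffer, `samples_per_epoch=4` for 3 epochs touches 12 samples total, while `samples_per_epoch=10` for 2 epochs touches all 20.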
Files changed:

- config/config.py: added the `resume_model` and `samples_per_epoch` fields
- config/agent.config.yaml: added the corresponding configuration entries
- core/trainer.py: fully refactored the training flow; added visualization
- data/training_buffer.py: added DataLoader support; removed the old `sample` method
- scripts/train.py: calls the plotting method after training finishes

Related documents: docs/design/ARCHITECTURE.md, config/agent.config.yaml