MADDPG help request - 深度强化学习实验室 (Deep Reinforcement Learning Lab)

MADDPG help request

XGZ

Why does the reward drop partway through training and then rise again?

Learner

XGZ I'm not very familiar with multi-agent RL, but it is quite normal for the reward to fluctuate up and down.

stepNeverStop

That's completely normal. The model may have learned a locally optimal policy partway through training.
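One way to check whether a dip like this is just noise or a real change in the policy is to smooth the per-episode rewards before reading the curve. Below is a minimal sketch of a trailing moving average; the function name `moving_average`, the window size, and the reward values are all made up for illustration and are not taken from the original poster's experiment.

```python
def moving_average(values, window):
    """Trailing moving average over `values`.

    For the first `window - 1` entries, the average is taken over
    however many values are available so far.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the value that just left the window.
            running_sum -= values[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


# Hypothetical per-episode rewards: rise, dip, then recover.
episode_rewards = [1.0, 3.0, 5.0, 2.0, 1.0, 4.0, 6.0, 8.0]
smoothed = moving_average(episode_rewards, window=3)
for episode, (raw, avg) in enumerate(zip(episode_rewards, smoothed)):
    print(f"episode {episode}: raw={raw:.1f} smoothed={avg:.2f}")
```

If the dip is still visible after smoothing over a reasonably large window, it is more likely that the agents temporarily settled into a worse (possibly locally optimal) policy than that the curve is simply noisy.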
