Traceback (most recent call last):
  File "E:/simulation/PGCartPole.py", line 59, in <module>
    action = RL.choose_action(observation)
  File "E:/simulation/PG.py", line 50, in choose_action
    action = np.random.choice(range(actions.shape[1]), p=actions.view(-1).detach().numpy())
  File "mtrand.pyx", line 929, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN
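I'm running a simulation of the Policy Gradients algorithm. Judging from the error, the random action selection step is receiving an invalid probability for at least one action (the message says NaN rather than infinity). How can I fix this?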
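One way to narrow this down is to check the probabilities before they reach `np.random.choice`. The sketch below is not the code from `PG.py`; `stable_softmax` and `choose_action_checked` are hypothetical helper names, and it assumes the network's raw outputs (logits) are available as a NumPy array. It shows two things: a softmax that subtracts the largest logit so `exp()` cannot overflow into `inf / inf = nan`, and a sampling wrapper that fails early with a readable message when the policy output is not finite.

```python
import numpy as np

def stable_softmax(logits):
    """Softmax that subtracts the max logit first so exp() cannot overflow."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def choose_action_checked(probs, rng=None):
    """Sample an action index, raising a clear error on non-finite input."""
    probs = np.asarray(probs, dtype=np.float64).ravel()
    if not np.all(np.isfinite(probs)):
        raise ValueError(f"policy output is not finite: {probs}")
    probs = probs / probs.sum()  # renormalise away small rounding error
    if rng is None:
        rng = np.random.default_rng()
    return int(rng.choice(len(probs), p=probs))

# Large logits overflow a naive softmax but not the shifted one.
logits = np.array([1000.0, 1001.0])
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(logits) / np.exp(logits).sum()  # inf / inf gives nan

stable = stable_softmax(logits)
action = choose_action_checked(stable, rng=np.random.default_rng(0))
```

If the check raises on the very first bad step, the NaN is probably coming from the network itself rather than from the sampling code. Commonly reported causes in policy gradient training are a learning rate that is too large, taking `log` of a probability that has reached exactly 0 in the loss, or unnormalised observations or returns; once the weights contain NaN, every later forward pass returns NaN, so the fix has to happen in the training step rather than in `choose_action`.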