References
[1] Deep Reinforcement Learning Doesn’t Work Yet, Alex Irpan, 2018
[2] Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, Islam et al., 2017
[3] Deep Reinforcement Learning that Matters, Henderson et al., 2017
[4] Lessons Learned Reproducing a Deep Reinforcement Learning Paper, Matthew Rahtz, 2018
[5] UCL Course on RL
[6] Berkeley Deep RL Course
[7] Deep RL Bootcamp
[8] Nuts and Bolts of Deep RL, John Schulman
[9] Stanford Deep Learning Tutorial: Multi-Layer Neural Network
[10] The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy, 2015
[11] LSTM: A Search Space Odyssey, Greff et al., 2015
[12] Understanding LSTM Networks, Chris Olah, 2015
[13] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Chung et al., 2014 (GRU paper)
[14] Conv Nets: A Modular Perspective, Chris Olah, 2014
[15] Stanford CS231n, Convolutional Neural Networks for Visual Recognition
[16] Deep Residual Learning for Image Recognition, He et al., 2015 (ResNets)
[17] Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al., 2014 (attention mechanisms)
[18] Attention Is All You Need, Vaswani et al., 2017
[19] A Simple Weight Decay Can Improve Generalization, Krogh and Hertz, 1992
[20] Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava et al., 2014
[21] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe and Szegedy, 2015
[22] Layer Normalization, Ba et al., 2016
[23] Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, Salimans and Kingma, 2016
[24] Stanford Deep Learning Tutorial: Stochastic Gradient Descent
[25] Adam: A Method for Stochastic Optimization, Kingma and Ba, 2014
[26] An overview of gradient descent optimization algorithms, Sebastian Ruder, 2016
[27] Auto-Encoding Variational Bayes, Kingma and Welling, 2013 (Reparameterization trick)
[28] TensorFlow
[29] PyTorch
[30] Spinning Up