Poster sessions for a paper take place on the same day as its presentation.
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Rafael Rafailov, Kyle Beltran Hatch, Anikait Singh, Aviral Kumar, Laura Smith, Ilya Kostrikov, Philippe Hansen-Estruch, Victor Kolev, Philip J. Ball, Jiajun Wu, Sergey Levine, Chelsea Finn
Harnessing Discrete Representations for Continual Reinforcement Learning
Edan Jacob Meyer, Adam White, Marlos C. Machado
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning
Davide Corsi, Davide Camponogara, Alessandro Farinelli
Investigating the Interplay of Prioritized Replay and Generalization
Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data
Kartik Choudhary, Dhawal Gupta, Philip S. Thomas
Resource Usage Evaluation of Discrete Model-Free Deep Reinforcement Learning Algorithms
Olivia P. Dizon-Paradis, Stephen E. Wormald, Daniel E. Capecci, Avanti Bhandarkar, Damon L. Woodard
OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments
Quentin Delfosse, Jannis Blüml, Bjarne Gregori, Sebastian Sztwiertnia, Kristian Kersting
The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning
Andrew Patterson, Samuel Neumann, Raksha Kumaraswamy, Martha White, Adam White
An Open-Loop Baseline for Reinforcement Learning Locomotion Tasks
Antonin Raffin, Olivier Sigaud, Jens Kober, Alin Albu-Schaeffer, João Silvério, Freek Stulp
Combining Automated Optimisation of Hyperparameters and Reward Shape
Julian Dierkes, Emma Cramer, Holger Hoos, Sebastian Trimpe
Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning
Marcel Hussing, Jorge Mendez-Mendez, Anisha Singrodia, Cassandra Kent, Eric Eaton
Stable-Baselines3: Reliable Reinforcement Learning Implementations
Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, Noah Dormann
A Provably Efficient Option-Based Algorithm for both High-Level and Low-Level Learning
Gianluca Drappo, Alberto Maria Metelli, Marcello Restelli
Bandits with Multimodal Structure
Hassan Saber, Odalric-Ambrym Maillard
Policy Gradient with Active Importance Sampling
Matteo Papini, Giorgio Manganini, Alberto Maria Metelli, Marcello Restelli
Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes
He Wang, Laixi Shi, Yuejie Chi
Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits
Woojin Jeong, Seungki Min
A Batch Sequential Halving Algorithm without Performance Degradation
Sotetsu Koyamada, Soichiro Nishimori, Shin Ishii
Graph Neural Thompson Sampling
Shuang Wu, Arash A. Amini
A Tighter Convergence Proof of Reverse Experience Replay
Nan Jiang, Jinzhao Li, Yexiang Xue
Cost Aware Best Arm Identification
Kellen Kanarios, Qining Zhang, Lei Ying
Towards Principled, Practical Policy Gradient for Bandits and Tabular MDPs
Michael Lu, Matin Aghaei, Anant Raj, Sharan Vaswani
Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms
Javad Azizi, Thang Duong, Yasin Abbasi-Yadkori, András György, Claire Vernade, Mohammad Ghavamzadeh
Causal Contextual Bandits with Adaptive Context
Rahul Madhavan, Aurghya Maiti, Gaurav Sinha, Siddharth Barman
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang, Remi Tachet des Combes, Romain Laroche
Co-Learning Empirical Games & World Models
Max Olan Smith, Michael P. Wellman
Best Response Shaping
Milad Aghajohari, Tim Cooijmans, Juan Agustin Duque, Shunichi Akatsuka, Aaron Courville
Shield Decomposition for Safe Reinforcement Learning in General Partially Observable Multi-Agent Environments
Daniel Melcer, Christopher Amato, Stavros Tripakis
Quantifying Interaction Level Between Agents Helps Cost-efficient Generalization in Multi-agent Reinforcement Learning
Yuxin Chen, Chen Tang, Thomas Tian, Chenran Li, Jinning Li, Masayoshi Tomizuka, Wei Zhan
Reinforcement Learning from Delayed Observations via World Models
Armin Karamzade, Kyungmin Kim, Montek Kalsi, Roy Fox
Cyclicity-Regularized Coordination Graphs
Oliver Järnefelt, Mahdi Kallel, Carlo D'Eramo
Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization
Aditya Kapoor, Benjamin Freed, Jeff Schneider, Howie Choset
Trust-based Consensus in Multi-Agent Reinforcement Learning Systems
Ho Long Fung, Victor-Alexandru Darvariu, Stephen Hailes, Mirco Musolesi
Inception: Efficiently Computable Misinformation Attacks on Markov Games
Jeremy McMahan, Young Wu, Yudong Chen, Jerry Zhu, Qiaomin Xie
Human-compatible driving agents through data-regularized self-play reinforcement learning
Daphne Cornelisse, Eugene Vinitsky
On Welfare-Centric Fair Reinforcement Learning
Cyrus Cousins, Kavosh Asadi, Elita Lobo, Michael Littman
BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations
Robert J. Moss, Anthony Corso, Jef Caers, Mykel Kochenderfer
Dissecting Deep RL with High Update Ratios: Combatting Value Divergence
Marcel Hussing, Claas A Voelcker, Igor Gilitschenski, Amir-massoud Farahmand, Eric Eaton
Mixture of Experts in a Mixture of RL settings
Timon Willi, Johan Samir Obando Ceron, Jakob Nicolaus Foerster, Gintare Karolina Dziugaite, Pablo Samuel Castro
Light-weight Probing of Unsupervised Representations for Reinforcement Learning
Wancong Zhang, Anthony GX-Chen, Vlad Sobal, Yann LeCun, Nicolas Carion
Zero-shot cross-modal transfer of Reinforcement Learning policies through a Global Workspace
Léopold Maytié, Benjamin Devillers, Alexandre Arnold, Rufin VanRullen
PASTA: Pretrained Action-State Transformer Agents
Raphael Boige, Yannis Flet-Berliac, Lars C.P.M. Quaedvlieg, Arthur Flajolet, Guillaume Richard, Thomas Pierrot
Combining Reconstruction and Contrastive Methods for Multimodal Representations in RL
Philipp Becker, Sebastian Mossburger, Fabian Otto, Gerhard Neumann
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning
Abdulaziz Almuzairee, Nicklas Hansen, Henrik I Christensen
On the consistency of hyper-parameter selection in value-based deep reinforcement learning
Johan Samir Obando Ceron, João Guilherme Madeira Araújo, Aaron Courville, Pablo Samuel Castro
Policy-Guided Diffusion
Matthew Thomas Jackson, Michael Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob Nicolaus Foerster
SplAgger: Split Aggregation for Meta-Reinforcement Learning
Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson
Learning to Optimize for Reinforcement Learning
Qingfeng Lan, A. Rupam Mahmood, Shuicheng Yan, Zhongwen Xu
Investigating the properties of neural network representations in reinforcement learning
Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White
Cheap and Deterministic Inference for Deep State-Space Models of Interacting Dynamical Systems
Andreas Look, Barbara Rakitsch, Melih Kandemir, Jan Peters
Learning Action-based Representations Using Invariance
Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang
Representation Alignment from Human Feedback for Cross-Embodiment Reward Learning from Mixed-Quality Demonstrations
Connor Mattson, Anurag Sidharth Aribandi, Daniel S. Brown
Offline Reinforcement Learning from Datasets with Structured Non-Stationarity
Johannes Ackermann, Takayuki Osa, Masashi Sugiyama
Offline Diversity Maximization under Imitation Constraints
Marin Vlastelica, Jin Cheng, Georg Martius, Pavel Kolev
Imitation Learning from Observation through Optimal Transport
Wei-Di Chang, Scott Fujimoto, David Meger, Gregory Dudek
ROIL: Robust Offline Imitation Learning without Trajectories
Gersi Doko, Guang Yang, Daniel S. Brown, Marek Petrik
Agent-Centric Human Demonstrations Train World Models
James Staley, Elaine Short, Shivam Goel, Yash Shukla
Inverse Reinforcement Learning with Multiple Planning Horizons
Jiayu Yao, Weiwei Pan, Finale Doshi-Velez, Barbara E Engelhardt
Semi-Supervised One Shot Imitation Learning
Philipp Wu, Kourosh Hakhamaneshi, Yuqing Du, Igor Mordatch, Aravind Rajeswaran, Pieter Abbeel
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang, Honghao Wei, Lei Ying
Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning
Nicholas E. Corrado, Yuxiao Qu, John U. Balis, Adam Labiosa, Josiah P. Hanna
Reward (Mis)design for autonomous driving
W. Bradley Knox, Alessandro Allievi, Holger Banzhaf, Felix Schmitt, Peter Stone
Models of human preference for learning reward functions
W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro G Allievi
The Cliff of Overcommitment with Policy Gradient Step Sizes
Scott M. Jordan, Samuel Neumann, James E. Kostas, Adam White, Philip S. Thomas
Demystifying the Recency Heuristic in Temporal-Difference Learning
Brett Daley, Marlos C. Machado, Martha White
When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning
Claas A Voelcker, Tyler Kastner, Igor Gilitschenski, Amir-massoud Farahmand
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan, Ziping Xu
States as goal-directed concepts: an epistemic approach to state-representation learning
Nadav Amir, Yael Niv, Angela J Langdon
Tiered Reward: Designing Rewards for Specification and Fast Learning of Desired Behavior
Zhiyuan Zhou, Shreyas Sundara Raman, Henry Sowerby, Michael Littman
Unifying Model-Based and Model-Free Reinforcement Learning with Equivalent Policy Sets
Benjamin Freed, Thomas Wei, Roberto Calandra, Jeff Schneider, Howie Choset
Multistep Inverse Is Not All You Need
Alexander Levine, Peter Stone, Amy Zhang
An Idiosyncrasy of Time-discretization in Reinforcement Learning
Kris De Asis, Richard S. Sutton
Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL
Miguel Suau, Matthijs T. J. Spaan, Frans A Oliehoek
Mitigating the Curse of Horizon in Monte-Carlo Returns
Alex Ayoub, David Szepesvari, Francesco Zanini, Bryan Chan, Dhawal Gupta, Bruno Castro da Silva, Dale Schuurmans
Structure in Deep Reinforcement Learning: A Survey and Open Problems
Aditya Mohan, Amy Zhang, Marius Lindauer
Sequential Decision-Making for Inline Text Autocomplete
Rohan Chitnis, Shentao Yang, Alborz Geramifard
A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo
Miguel Vasco, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Peter R. Wurman, Peter Stone
Towards General Negotiation Strategies with End-to-End Reinforcement Learning
Bram M. Renting, Thomas M. Moerland, Holger Hoos, Catholijn M Jonker
JoinGym: An Efficient Join Order Selection Environment
Junxiong Wang, Kaiwen Wang, Yueying Li, Nathan Kallus, Immanuel Trummer, Wen Sun
Policy Architectures for Compositional Generalization in Control
Allan Zhou, Vikash Kumar, Chelsea Finn, Aravind Rajeswaran
Verification-Guided Shielding for Deep Reinforcement Learning
Davide Corsi, Guy Amir, Andoni Rodríguez, Guy Katz, César Sánchez, Roy Fox
Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning
Zakariae El Asri, Olivier Sigaud, Nicolas Thome
Bidirectional-Reachable Hierarchical Reinforcement Learning with Mutually Responsive Policies
Yu Luo, Fuchun Sun, Tianying Ji, Xianyuan Zhan
RL for Consistency Models: Reward Guided Text-to-Image Generation with Fast Inference
Owen Oertell, Jonathan Daniel Chang, Yiyi Zhang, Kianté Brantley, Wen Sun
Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps
Linfeng Zhao, Lawson L.S. Wong
Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning
Gautham Vasan, Yan Wang, Fahim Shahriar, James Bergstra, Martin Jägersand, A. Rupam Mahmood
Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras
Mhairi Dunion, Stefano V Albrecht
Emergent behaviour and neural dynamics in artificial agents tracking odour plumes
Satpreet H. Singh, Floris van Breugel, Rajesh P. N. Rao, Bingni W. Brunton
GVFs in the real world: making predictions online for water treatment
Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White
Weight Clipping for Deep Continual and Reinforcement Learning
Mohamed Elsayed, Qingfeng Lan, Clare Lyle, A. Rupam Mahmood
Policy Gradient Algorithms with Monte Carlo Tree Learning for Non-Markov Decision Processes
Tetsuro Morimura, Kazuhiro Ota, Kenshi Abe, Peinan Zhang
ROER: Regularized Optimal Experience Replay
Changling Li, Zhang-Wei Hong, Pulkit Agrawal, Divyansh Garg, Joni Pajarinen
Learning Discrete World Models for Heuristic Search
Forest Agostinelli, Misagh Soltani
Boosting Soft Q-Learning by Bounding
Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, Rahul V Kulkarni
Reward Centering
Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton
Stabilizing Extreme Q-learning by Maclaurin Expansion
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
Contextualized Hybrid Ensemble Q-learning: Learning Fast with Control Priors
Emma Cramer, Bernd Frauenknecht, Ramil Sabirov, Sebastian Trimpe
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart
PID Accelerated Temporal Difference Algorithms
Mark Bedaywi, Amin Rakhsha, Amir-massoud Farahmand
SwiftTD: A Fast and Robust Algorithm for Temporal Difference Learning
Khurram Javed, Arsalan Sharifnassab, Richard S. Sutton
Posterior Sampling for Continuing Environments
Wanqiao Xu, Shi Dong, Benjamin Van Roy
Off-Policy Actor-Critic with Emphatic Weightings
Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White
The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning
Adriana Hugessen, Roger Creus Castanyer, Faisal Mohamed, Glen Berseth
Exploring Uncertainty in Distributional Reinforcement Learning
Georgy Antonov, Peter Dayan
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
Haque Ishfaq, Yixin Tan, Yu Yang, Qingfeng Lan, Jianfeng Lu, A. Rupam Mahmood, Doina Precup, Pan Xu
Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
Trevor McInroe, Adam Jelley, Stefano V Albrecht, Amos Storkey
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance
Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater
Online Planning in POMDPs with State-Requests
Raphaël Avalos, Eugenio Bargiacchi, Ann Nowe, Diederik Roijers, Frans A Oliehoek
Informed POMDP: Leveraging Additional Information in Model-Based RL
Gaspard Lambrechts, Adrien Bolland, Damien Ernst
Bounding-Box Inference for Error-Aware Model-Based Reinforcement Learning
Erin J Talvitie, Zilei Shao, Huiying Li, Jinghan Hu, Jacob Boerma, Rory Zhao, Xintong Wang
Dreaming of Many Worlds: Learning Contextual World Models aids Zero-Shot Generalization
Sai Prasanna, Karim Farid, Raghu Rajan, André Biedenkapp
Learning Abstract World Models for Value-preserving Planning with Options
Rafael Rodriguez-Sanchez, George Konidaris
Granger Causal Interaction Skill Chains
Caleb Chuck, Kevin Black, Aditya Arjun, Yuke Zhu, Scott Niekum
On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning
Philipp Becker, Gerhard Neumann
Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models
Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin J. Talvitie, Michael Bowling, Martha White