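[Multi-Agent Environments] DeepMind Open-Sources Melting Pot for Multi-Agent Reinforcement Learning: A Simulation Environment for Emergent Bartering Behavior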

Official Lab Assistant

Melting Pot

A suite of test scenarios for multi-agent reinforcement learning.

PDF: https://arxiv.org/pdf/2205.06760.pdf
Github: https://github.com/deepmind/meltingpot

About

Melting Pot assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and has been designed to test a broad range of social interactions such as: cooperation, competition, deception, reciprocation, trust, stubbornness and so on. Melting Pot offers researchers a set of 21 multi-agent reinforcement learning substrates (multi-agent games) on which to train agents, and over 85 unique test scenarios on which to evaluate these trained agents. The performance of agents on these held-out test scenarios quantifies whether agents:

perform well across a range of social situations where individuals are interdependent,
interact effectively with unfamiliar individuals not seen during training,
pass a universalization test, that is, give a positive answer to the question: what if everyone behaved like that?

The resulting score can then be used to rank different multi-agent RL algorithms by their ability to generalize to novel social situations.
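
As a rough illustration of how per-scenario results could be turned into a ranking, the sketch below averages each algorithm's scores over the test scenarios and sorts by that mean. The algorithm names, scenario names, and numbers are invented for illustration, and the plain unweighted mean is an assumption; the text above does not specify how scores are aggregated.

```python
from statistics import mean

# Hypothetical per-scenario scores, invented for illustration only.
scores = {
    "algo_a": {"scenario_0": 0.8, "scenario_1": 0.25, "scenario_2": 0.6},
    "algo_b": {"scenario_0": 0.5, "scenario_1": 0.5, "scenario_2": 0.5},
}


def rank_algorithms(scores_by_algo):
    """Return (name, mean score) pairs sorted from best to worst."""
    averaged = {
        name: mean(per_scenario.values())
        for name, per_scenario in scores_by_algo.items()
    }
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Assumption: an unweighted mean over scenarios is the aggregate score.
ranking = rank_algorithms(scores)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A weighted mean or a per-scenario normalization could be swapped in by changing only the body of rank_algorithms.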

We hope Melting Pot will become a standard benchmark for multi-agent reinforcement learning. We plan to maintain it, and will be extending it in the coming years to cover more social interactions and generalization scenarios.

If you are interested in extending Melting Pot, please refer to the Extending Melting Pot documentation.

Installation

Melting Pot is built on top of DeepMind Lab2D. The installation steps are as follows (see install.sh for an example installation script):

(Optional) Activate a virtual environment, e.g.:

python3 -m venv "${HOME}/meltingpot_venv"
source "${HOME}/meltingpot_venv/bin/activate"
Install dmlab2d from the dmlab2d wheel files, e.g.:

pip3 install https://github.com/deepmind/lab2d/releases/download/release_candidate_2021-07-13/dmlab2d-1.0-cp39-cp39-manylinux_2_31_x86_64.whl

If there is no appropriate wheel (e.g. M1 chipset) you will need to install dmlab2d and build the wheel yourself using bazel build -c opt --config=lua5_1 //dmlab2d:dmlab2d_wheel.
Test the dmlab2d installation in python3:

import dmlab2d
import dmlab2d.runfiles_helper

lab = dmlab2d.Lab2d(dmlab2d.runfiles_helper.find(), {"levelName": "chase_eat"})
env = dmlab2d.Environment(lab, ["WORLD.RGB"])
env.step({})
Install Melting Pot:

git clone -b main https://github.com/deepmind/meltingpot
cd meltingpot
curl -L https://storage.googleapis.com/dm-meltingpot/meltingpot-assets-1.0.0.tar.gz \
    | tar -xz --directory=meltingpot
pip3 install .
Test the Melting Pot installation:

pip3 install pytest
pytest meltingpot
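
The dmlab2d wheel named in the installation steps carries its target in its filename (cp39 for CPython 3.9, manylinux_2_31_x86_64 for x86_64 Linux). The sketch below splits a wheel filename into its tags and does a rough comparison against the running interpreter, which can help decide early whether building from source is needed. The function names are hypothetical, and the comparison is a simplified heuristic: it ignores the glibc version and is not the full compatibility logic that pip applies.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Layout: name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def roughly_matches_this_machine(filename):
    """Heuristic only: compares the CPython version, OS, and CPU tags."""
    python_tag, _, platform_tag = parse_wheel_tags(filename)
    this_python = f"cp{sys.version_info.major}{sys.version_info.minor}"
    machine = platform.machine().lower()
    same_python = python_tag == this_python
    same_os = sys.platform.startswith("linux") and "linux" in platform_tag
    same_cpu = bool(machine) and platform_tag.endswith(machine)
    return same_python and same_os and same_cpu


WHEEL = "dmlab2d-1.0-cp39-cp39-manylinux_2_31_x86_64.whl"
print(parse_wheel_tags(WHEEL))
print(roughly_matches_this_machine(WHEEL))
```

If the second value printed is False, the prebuilt wheel is unlikely to install on the current machine and the bazel build route described above is the probable fallback.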

Example usage

You can try out the substrates interactively with the human_players scripts. For example, to play the clean_up substrate, you can run:

python3 meltingpot/python/human_players/play_clean_up.py

You can move around with the W, A, S, D keys, turn with Q and E, fire the zapper with 1, and fire the cleaning beam with 2. You can switch between players with TAB. Other substrates are available in the human_players directory; some have multiple variants, which you select with the --level_name flag.

NOTE: If you get a ModuleNotFoundError: No module named 'meltingpot.python' error, you can solve it by adding the meltingpot root directory to PYTHONPATH (e.g. by running export PYTHONPATH=$(pwd) from that directory).
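
To check whether the PYTHONPATH fix took effect before launching a script, a small standard-library probe can report whether a module is importable. This is a minimal sketch: the helper name is_importable is hypothetical, and the only module name taken from the text above is meltingpot.python.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the module on the current path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


for name in ("meltingpot", "meltingpot.python"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

Note that checking a dotted name imports its parent package, so the probe can take a moment on first use.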

Training agents

We provide a simple example using RLLib to train agents in self-play on a Melting Pot substrate. Note that Melting Pot is agnostic to how you train your agents, and this example is not meant to be a suggestion on how to achieve good scores in the task suite.

First you will need to install the dependencies needed by the examples:

cd <meltingpot_root>
pip3 install -e .[examples]

Then you can run the training experiment using:

cd <meltingpot_root>/examples/rllib
python3 self_play_train.py

Documentation

Full documentation is available in the GitHub repository linked above.

Citing Melting Pot

If you use Melting Pot in your work, please cite the accompanying article:

@inproceedings{leibo2021meltingpot,
    title={Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot},
    author={Joel Z. Leibo AND Edgar Du\'e\~nez-Guzm\'an AND Alexander Sasha Vezhnevets AND John P. Agapiou AND Peter Sunehag AND Raphael Koster AND Jayd Matyas AND Charles Beattie AND Igor Mordatch AND Thore Graepel},
    year={2021},
    journal={International conference on machine learning},
    organization={PMLR}
}

Disclaimer

This is not an officially supported Google product.
