[Deep Reinforcement Learning Competition] BASALT 2021 (sponsored by Microsoft and OpenAI)

Lab Official Assistant

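We have added a new competition to the MineRL family: BASALT, a competition on solving human-judged tasks, with $11,000 in prizes. The tasks in this competition have no predefined reward function: the goal is to produce trajectories that human judges consider to solve the given task effectively. This is uncharted territory for the ML community, and it will require a different set of norms and training procedures, perhaps combining demonstrations with live human rankings, ratings, or comparisons to steer agents in the right direction. We hope this competition pushes the research community to build these new procedures, which we expect to become increasingly important as AI systems are integrated into more areas of our lives.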

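Like the Diamond competition, BASALT provides a set of Gym environments paired with human demonstrations, since imitation-based methods are an important component of solving hard-to-specify tasks.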

Competition homepage: https://minerl.io/basalt/

Register for the competition: sign up to participate on AIcrowd (it seems the site may require a VPN to access).

The Tasks

FindCave

The agent should search for a cave, and terminate the episode when it is inside one.

MakeWaterfall

After spawning in a mountainous area, the agent should build a beautiful waterfall and then reposition itself to take a scenic picture of the same waterfall.

CreateVillageAnimalPen

After spawning in a village, the agent should build an animal pen containing two of the same kind of animal next to one of the houses in a village.

BuildVillageHouse

Using items in its starting inventory, the agent should build a new house in the style of the village, in an appropriate location (e.g. next to the path through the village), without harming the village in the process.

Competition Overview

All submissions are through AIcrowd. There you can find detailed rules as well as the leaderboard.

Submission: Submit Trained Agents

Participants train agents to solve BASALT tasks. Participants submit both the training code and the already-trained models for evaluation.

Baseline submission

Our baseline is a simple behavioral cloning algorithm trained for a couple of hours. We hope to see participants improve upon it significantly!
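For anyone unfamiliar with behavioral cloning, the idea is to treat the human demonstrations as a supervised dataset of (observation, action) pairs and fit a policy that predicts the demonstrator's action. The sketch below is only a toy illustration of that idea and is not the organizers' baseline: the two-dimensional observations, the two discrete actions, the scripted `toy_expert`, and all function and class names are hypothetical stand-ins chosen so the example runs with the Python standard library alone. A real BASALT agent would learn from image observations and the MineRL action space instead.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearPolicy:
    """Softmax policy with one linear layer (hypothetical toy model)."""

    def __init__(self, obs_dim, n_actions):
        self.w = [[0.0] * obs_dim for _ in range(n_actions)]
        self.b = [0.0] * n_actions

    def action_probs(self, obs):
        logits = [
            sum(wi * x for wi, x in zip(row, obs)) + bias
            for row, bias in zip(self.w, self.b)
        ]
        return softmax(logits)

    def act(self, obs):
        # Greedy action: index of the highest predicted probability.
        probs = self.action_probs(obs)
        return max(range(len(probs)), key=probs.__getitem__)


def toy_expert(obs):
    """Stand-in for a human demonstrator: action 0 if obs[0] > obs[1], else 1."""
    return 0 if obs[0] > obs[1] else 1


def make_toy_demos(rng, n):
    """Generate n (observation, action) pairs from the scripted expert."""
    demos = []
    for _ in range(n):
        obs = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        demos.append((obs, toy_expert(obs)))
    return demos


def behavioral_cloning(policy, demos, epochs=50, lr=0.5):
    """Fit the policy to the demonstrations by SGD on cross-entropy loss."""
    for _ in range(epochs):
        for obs, action in demos:
            probs = policy.action_probs(obs)
            for a in range(len(probs)):
                # Gradient of cross-entropy w.r.t. logit a: p_a - 1[a == action].
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j, x in enumerate(obs):
                    policy.w[a][j] -= lr * grad * x
                policy.b[a] -= lr * grad
    return policy


def accuracy(policy, demos):
    """Fraction of demonstration steps where the policy matches the expert."""
    hits = sum(1 for obs, action in demos if policy.act(obs) == action)
    return hits / len(demos)


if __name__ == "__main__":
    rng = random.Random(0)
    demos = make_toy_demos(rng, 200)
    policy = behavioral_cloning(LinearPolicy(obs_dim=2, n_actions=2), demos)
    print("agreement with demonstrations:", accuracy(policy, demos))
```

Note that matching the demonstrator on recorded steps says nothing directly about how human judges would rate the resulting trajectories, which is the gap the competition asks participants to close.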

Tzy2020

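The lab could organize everyone to enter the competition under the name of the university and the lab. I just don't know whether many people would take part.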

Quantum-Cheese

Tzy2020 I'd like to take part. I hope we can get organized and form a team~
