Facebook researchers released the NetHack Learning Environment, a research tool for improving the robustness and generalization of reinforcement learning agents.

Games have served as benchmarks for AI for years. Interest accelerated in 2013, when Google subsidiary DeepMind demonstrated an AI system that could play games like Pong, Breakout, Space Invaders, Seaquest, Beamrider, Enduro, and Q*bert at superhuman levels. These advances matter beyond game design. According to DeepMind cofounder Demis Hassabis, they are informing the development of systems that might someday diagnose illnesses, predict complicated protein structures, and segment CT scans.


NetHack, released in 1987, is more sophisticated than it seems. It tasks players with descending more than 50 dungeon levels to retrieve a magical amulet. Along the way, players must use hundreds of items and fight monsters while contending with the many interactions between the two. The levels are procedurally generated, so every game is different, which, according to the Facebook researchers, tests the generalization limits of current state-of-the-art AI.

NetHack has another advantage: its lightweight architecture. It consists of a turn-based, ASCII-art world and a game engine, written primarily in C, that captures its complexity. The engine forgoes all but the simplest physics and renders symbols instead of pixels. This lets models learn quickly without spending computational resources on simulating dynamics or rendering observations.
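To give a rough sense of why symbols are cheaper than pixels, the sketch below compares the number of values in one text-screen observation with one raw image frame. Both sizes are assumptions chosen for illustration and do not come from the researchers: a 24x80 character terminal, and a 210x160 RGB frame of the kind used in Atari-style benchmarks.

```python
# Illustrative sizes only; these are assumptions, not figures from the article.
TERMINAL_ROWS, TERMINAL_COLS = 24, 80               # a classic text-terminal screen
FRAME_HEIGHT, FRAME_WIDTH, CHANNELS = 210, 160, 3   # a raw Atari-style RGB frame


def symbol_observation_size(rows=TERMINAL_ROWS, cols=TERMINAL_COLS):
    """Number of values in one observation when each cell is a single symbol."""
    return rows * cols


def pixel_observation_size(height=FRAME_HEIGHT, width=FRAME_WIDTH, channels=CHANNELS):
    """Number of values in one observation when each pixel has several channels."""
    return height * width * channels


symbols = symbol_observation_size()
pixels = pixel_observation_size()
ratio = pixels / symbols

print(f"symbols per observation: {symbols}")
print(f"pixel values per observation: {pixels}")
print(f"pixel frame is {ratio:.1f}x larger")
```

Under these assumed sizes the pixel frame carries roughly fifty times as many values per step, before counting the cost of rendering it.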

Training sophisticated machine learning models in the cloud remains expensive. According to one study, the University of Washington’s Grover, which is tailored for both generating and detecting fake news, cost $25,000 to train over two weeks. OpenAI racked up $256 per hour training its GPT-2 language model, and Google spent an estimated $6,912 training BERT.

In contrast, a single high-end graphics card is enough to train AI-driven NetHack agents using the TorchBeast framework, which supports scaling by adding more graphics cards or machines. Agents can therefore experience many steps in the environment in a reasonable time frame while still challenging the limits of what current AI techniques can achieve. “NetHack presents a challenge that’s on the frontier of current methods, without the computational costs of other challenging simulation environments. Standard deep [reinforcement learning] agents currently operating on NetHack explore only a fraction of the overall game of NetHack,” Facebook researchers stated. “Progress in this challenging new environment will require [reinforcement learning] agents to move beyond tabula rasa learning.”

The NetHack Learning Environment has three primary components:

  • A Python interface to NetHack using the accessible OpenAI Gym API,
  • A suite of benchmark tasks, and
  • A baseline agent.

The benchmark suite comprises seven tasks designed to measure agents’ progress.
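For readers unfamiliar with the Gym convention, the interaction loop it implies looks roughly like the sketch below. To keep the example runnable without installing anything, it uses a hypothetical stand-in, `ToyDungeonEnv`, which is not part of the NetHack Learning Environment and is far simpler than NetHack. It only mirrors the classic `reset()` / `step(action)` shape, where `step` returns an observation, a reward, a done flag, and an info dictionary. With the real environment installed, the environment would presumably be created through Gym's registry rather than instantiated directly; the project's documentation lists the actual environment names.

```python
import random


class ToyDungeonEnv:
    """Hypothetical one-row ASCII corridor with a Gym-style interface (not NLE)."""

    ACTIONS = ("left", "right")

    def __init__(self, length=8, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def _render(self):
        # The observation is a string of symbols: '@' is the agent, '>' the stairs.
        cells = ["."] * self.length
        cells[-1] = ">"
        cells[self.pos] = "@"
        return "".join(cells)

    def reset(self):
        """Start a new episode and return the first observation."""
        self.pos = 0
        self.steps = 0
        return self._render()

    def step(self, action):
        """Apply an action index and return (observation, reward, done, info)."""
        move = -1 if self.ACTIONS[action] == "left" else 1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        self.steps += 1
        reached_stairs = self.pos == self.length - 1
        reward = 1.0 if reached_stairs else 0.0
        done = reached_stairs or self.steps >= self.max_steps
        return self._render(), reward, done, {"steps": self.steps}


def run_episode(env, policy):
    """Run one episode with the given policy; return (total reward, steps taken)."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, info["steps"]


# A random agent, the usual starting point before any learning is added.
rng = random.Random(0)
total, steps = run_episode(ToyDungeonEnv(), lambda obs: rng.randrange(2))
print(f"random agent: reward={total}, steps={steps}")
```

A learning agent would replace the random policy with one that maps the symbolic observation to an action, while the loop itself stays the same.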

The environment’s co-authors note that NetHack comes with a vast body of external resources, which they expect will be used to improve agents’ performance. “We believe that the NetHack Learning environment will inspire further research on robust exploration strategies in [reinforcement learning], planning with long-term horizons, and transferring commonsense knowledge from resources outside of the simulation,” the researchers wrote. “[It] provides … agents with plenty of experience to learn from so that we as researchers can spend more time testing new ideas instead of waiting for results to come in. Also, we believe it democratizes access for researchers in more resource-constrained labs without sacrificing the difficulty and richness of the environment.”