Intel Labs collaborated with the Computer Vision Center in Spain, Kujiale in China, and the Technical University of Munich to develop the Simulator for Photorealistic Embodied AI Research (SPEAR). The result is a highly realistic, open-source simulation platform that accelerates the training and validation of embodied AI systems in indoor domains. The solution can be downloaded under an open-source MIT license.
Existing interactive simulators have limited content diversity, physical interactivity, and visual fidelity. SPEAR is designed to address these limitations, allowing developers to train and validate embodied agents for a growing set of tasks and domains.
The goal of SPEAR is to drive research and commercialization of household robotics through the simulation of human-robot interaction scenarios.
It took more than a year with a team of professional artists to construct a collection of high-quality, handcrafted, interactive environments. The SPEAR starter pack features more than 300 virtual indoor environments with more than 2,500 rooms and 17,000 objects that can be manipulated individually.
These interactive training environments use detailed geometry, photorealistic materials, realistic physics, and accurate lighting. New content packs targeting industrial and healthcare domains will be released soon.
The use of highly detailed simulation enables the development of more robust embodied AI systems. Roboticists can leverage simulated environments to train AI algorithms and optimize perception functions, manipulation, and spatial intelligence. The ultimate outcome is faster validation and a reduction in time-to-market.
In embodied AI, agents learn by interacting with their physical surroundings. Capturing and collating these encounters in the real world can be time-consuming, labor-intensive, and risky. Interactive simulations provide an environment in which to train and evaluate robots before deploying them in the real world.
Overview of SPEAR
SPEAR is designed based on three main requirements:
- Support a large, diverse, and high-quality collection of environments
- Provide sufficient physical realism to support realistic interactions and manipulation of a wide range of household objects
- Offer as much photorealism as possible, while still maintaining enough rendering speed to support training complex embodied agent behaviors
At its core, SPEAR is built on top of Unreal Engine, an industrial-strength game engine whose source code is publicly available. SPEAR environments are implemented as Unreal Engine assets, and SPEAR provides an OpenAI Gym interface for interacting with environments via Python.
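To illustrate what a Gym-style interface implies for user code, here is a minimal sketch of the classic reset/step interaction loop. It does not use SPEAR's actual API: `ToyIndoorEnv`, its observation dictionary, and its reward are hypothetical stand-ins so that the example runs on its own.

```python
class ToyIndoorEnv:
    """Hypothetical stand-in for a Gym-style environment (not part of SPEAR).

    The agent moves along a 1-D corridor toward a fixed goal position.
    """

    def __init__(self, goal=3.0, max_steps=10):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self):
        # Classic Gym convention: reset() returns the first observation.
        self.position = 0.0
        self.steps = 0
        return {"position": self.position}

    def step(self, action):
        # Classic Gym convention: step() returns (obs, reward, done, info).
        self.position += action
        self.steps += 1
        reward = -abs(self.goal - self.position)  # assumed reward shape
        done = self.position >= self.goal or self.steps >= self.max_steps
        return {"position": self.position}, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return (number of steps, total reward)."""
    obs = env.reset()
    total_reward, num_steps, done = 0.0, 0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total_reward += reward
        num_steps += 1
    return num_steps, total_reward


if __name__ == "__main__":
    # A trivial policy that always moves forward by one unit.
    print(run_episode(ToyIndoorEnv(), lambda obs: 1.0))
```

The same loop structure would apply to any environment that follows the Gym conventions; only the environment construction and the contents of the observations would differ.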
SPEAR currently supports four distinct embodied agents:
- OpenBot Agent – well-suited for sim-to-real experiments, it provides identical image observations to a real-world OpenBot, implements an identical control interface, and has been modeled with accurate geometry and physical parameters
- Fetch Agent – modeled using accurate geometry and physical parameters, Fetch Agent is able to interact with the environment via a physically realistic gripper
- LoCoBot Agent – modeled using accurate geometry and physical parameters, LoCoBot Agent is able to interact with the environment via a physically realistic gripper
- Camera Agent – can be teleported anywhere within the environment to create images of the world from any angle
The agents return photorealistic robot-centric observations from camera sensors, odometry from wheel encoder states, and joint encoder states. These observations are useful for validating kinematic models and predicting the robot’s operation.
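As a rough illustration of how wheel encoder states can be turned into odometry, the sketch below integrates the pose of a differential-drive robot from encoder tick counts. The differential-drive model, the function name, and all parameters (`ticks_per_rev`, `wheel_radius`, `wheel_base`) are assumptions made for this example and are not taken from SPEAR.

```python
import math


def integrate_odometry(pose, left_ticks, right_ticks,
                       ticks_per_rev, wheel_radius, wheel_base):
    """Update an (x, y, theta) pose from wheel encoder tick increments.

    Assumes an ideal differential-drive robot with no wheel slip.
    """
    x, y, theta = pose
    distance_per_tick = 2.0 * math.pi * wheel_radius / ticks_per_rev
    d_left = left_ticks * distance_per_tick
    d_right = right_ticks * distance_per_tick

    d_center = (d_left + d_right) / 2.0       # distance travelled by the body
    d_theta = (d_right - d_left) / wheel_base  # change in heading (radians)

    # Midpoint approximation: move along the average heading of the step.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)

    # Wrap the heading into [-pi, pi).
    theta = (theta + d_theta + math.pi) % (2.0 * math.pi) - math.pi
    return x, y, theta
```

Comparing a pose integrated this way against the simulator's ground-truth pose is one way a kinematic model could be checked before it is used on a physical robot.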
For optimizing navigation algorithms, the agents can also return a sequence of waypoints representing the shortest path to a goal location, as well as GPS and compass observations that point directly to the goal. Agents can also return pixel-perfect semantic segmentation and depth images, which are useful for correcting for inaccurate perception in downstream embodied tasks and for gathering static datasets.
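The navigation observations described above can be pictured with a small sketch: one helper expresses the goal as a distance and bearing relative to the agent (a compass-style reading), and another measures the length of a waypoint path. The function names, the 2-D coordinates, and the angle conventions are assumptions for illustration; SPEAR's actual observation format may differ.

```python
import math


def goal_in_agent_frame(agent_xy, agent_yaw, goal_xy):
    """Return (distance, bearing) of the goal relative to the agent.

    The bearing is in radians, measured counter-clockwise from the agent's
    heading and wrapped into [-pi, pi].
    """
    dx = goal_xy[0] - agent_xy[0]
    dy = goal_xy[1] - agent_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - agent_yaw
    # Wrap the angle so that small left/right offsets stay small.
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))
    return distance, bearing


def path_length(waypoints):
    """Total length of a polyline given as a sequence of (x, y) waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
```

For example, an agent at the origin facing along the x-axis with a goal at (0, 2) would get a distance of 2 and a bearing of roughly pi/2, meaning the goal lies directly to its left.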
SPEAR currently supports two distinct tasks:
- The Point-Goal Navigation Task randomly selects a goal position in the scene’s reachable space, computes a reward based on the agent’s distance to the goal, and triggers the end of an episode when the agent hits an obstacle or the goal.
- The Freeform Task is an empty placeholder task that is useful for collecting static datasets.
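The Point-Goal Navigation logic described above can be sketched in a few lines. This is an illustration under stated assumptions, not SPEAR's implementation: the article only says the reward is based on distance to the goal, so the negative-distance reward, the `goal_radius` threshold, and the function names here are hypothetical.

```python
import math
import random


def sample_goal(reachable_points, rng):
    """Pick a random goal from a list of reachable (x, y) positions."""
    return rng.choice(reachable_points)


def point_goal_step(agent_xy, goal_xy, hit_obstacle, goal_radius=0.25):
    """Return (reward, done) for one step of a point-goal task.

    Assumed reward: negative Euclidean distance to the goal, so the reward
    increases as the agent gets closer. The episode ends when the agent is
    within goal_radius of the goal or when it hits an obstacle.
    """
    distance = math.dist(agent_xy, goal_xy)
    reached_goal = distance <= goal_radius
    reward = -distance
    done = reached_goal or hit_obstacle
    return reward, done


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    goal = sample_goal([(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)], rng)
    print(goal, point_goal_step((0.0, 0.0), goal, hit_obstacle=False))
```

Other reward shapes, such as rewarding the reduction in distance between consecutive steps, would fit the same description equally well.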
SPEAR is available under an open-source MIT license, ready for customization on any hardware. For more details, visit the SPEAR GitHub page.