Researchers at the U.S. Army Research Laboratory (ARL) and the Robotics Institute at Carnegie Mellon University developed a new technique to quickly teach robots novel traversal behaviors with minimal human oversight.
The technique allows mobile robots to navigate autonomously in environments while carrying out actions a human would expect of the robot in a given situation. ARL researcher Dr. Maggie Wigness says one of the goals is to provide Soldiers with more reliable autonomous robot teammates.
“If a robot acts as a teammate, tasks can be accomplished faster and more situational awareness can be obtained,” Wigness said. “Further, robot teammates can be used as an initial investigator for potentially dangerous scenarios, thereby keeping Soldiers further from harm.”
To achieve this, the robot must be able to use its learned intelligence to perceive, reason and make decisions, Wigness said.
“This research focuses on how robot intelligence can be learned from a few human example demonstrations,” Wigness said. “The learning process is fast and requires minimal human demonstration, making it an ideal learning technique for on-the-fly learning in the field when mission requirements change.”
Using inverse reinforcement learning
The researchers focused their initial investigation on learning robot traversal behaviors with respect to the robot’s visual perception of terrain and objects in the environment. The robot was taught two behaviors: how to navigate from various points while staying near the edge of a road, and how to traverse covertly using buildings as cover.
According to the researchers, given different mission tasks, the most appropriate learned traversal behavior can be activated during robot operation. The behaviors are learned using inverse optimal control, also commonly referred to as inverse reinforcement learning, a class of machine learning techniques that seeks to recover a reward function from a known optimal policy.
In this case, a human demonstrates the optimal policy by driving a robot along a trajectory that best represents the behavior to be learned. These trajectory exemplars are then related to the visual terrain/object features, such as grass, roads and buildings, to learn a reward function with respect to these environment features.
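To make the idea concrete, the following is a minimal Python sketch of learning a reward function from a demonstrated trajectory. It is not the researchers' implementation. It assumes a linear reward over terrain feature counts and a small, hand-enumerated set of candidate trajectories, and it fits the weights with a maximum-entropy style update (demonstrated feature counts minus expected feature counts). The names `feature_counts`, `reward` and `learn_weights`, the toy trajectories, and the learning rate are all hypothetical.

```python
import math

# Terrain classes named in the article; the toy data below is invented.
FEATURES = ["grass", "road", "building"]

def feature_counts(trajectory):
    """Count how many cells of each terrain class a trajectory crosses."""
    return [sum(1 for cell in trajectory if cell == name) for name in FEATURES]

def reward(weights, counts):
    """Linear reward: weighted sum of terrain feature counts (an assumption)."""
    return sum(w * c for w, c in zip(weights, counts))

def learn_weights(demo, candidates, lr=0.1, steps=500):
    """Gradient ascent on the log-likelihood of the demonstration, where
    P(trajectory) is proportional to exp(reward) over the candidate set."""
    all_counts = [feature_counts(t) for t in candidates]
    demo_counts = feature_counts(demo)
    weights = [0.0] * len(FEATURES)
    for _ in range(steps):
        scores = [reward(weights, c) for c in all_counts]
        top = max(scores)  # subtract the max score for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Expected feature counts under the current reward model.
        expected = [sum(p * c[i] for p, c in zip(probs, all_counts))
                    for i in range(len(FEATURES))]
        # Gradient: demonstrated counts minus expected counts.
        weights = [w + lr * (d - e)
                   for w, d, e in zip(weights, demo_counts, expected)]
    return weights

# Each trajectory is the sequence of terrain labels under the robot.
demo = ["road"] * 6  # a human drives the robot along the road
candidates = [
    demo,
    ["grass"] * 3 + ["road"] * 3,
    ["grass"] * 6,
    ["building"] * 2 + ["road"] * 4,
]

weights = learn_weights(demo, candidates)
best = max(candidates, key=lambda t: reward(weights, feature_counts(t)))
print(dict(zip(FEATURES, weights)))
print(best == demo)
```

In this toy setting the learned weight for road ends up higher than the weights for grass and buildings, so the demonstrated trajectory scores best among the candidates. A fielded system would derive the features from the robot's visual perception and plan over a terrain map rather than choose from a fixed list.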
“We seek to create intelligent robotic systems that reliably operate in warfighter environments, meaning the scene is highly unstructured, possibly noisy, and we need to do this given relatively little a priori knowledge of the current state of the environment,” Wigness said. “The fact that our problem statement is so different than so many other researchers allows ARL to make a huge impact in autonomous systems research. Our techniques, by the very definition of the problem, must be robust to noise and have the ability to learn with relatively small amounts of data.”
According to Wigness, this preliminary research has helped the researchers demonstrate the feasibility of quickly learning an encoding of traversal behaviors.
“As we push this research to the next level, we will begin to focus on more complex behaviors, which may require learning from more than just visual perception features,” Wigness said. “Our learning framework is flexible enough to use a priori intel that may be available about an environment. This could include information about areas that are likely visible by adversaries or areas known to have reliable communication. This additional information may be relevant for certain mission scenarios, and learning with respect to these features would enhance the intelligence of the mobile robot.”
Transferring behavior learning to mobile platforms
The researchers are also exploring how this type of behavior learning transfers between different mobile platforms. Their evaluation to date has been performed with a small unmanned Clearpath Husky robot, which has a visual field of view that is relatively low to the ground.
“Transferring this technology to larger platforms will introduce new perception viewpoints and different platform maneuvering capabilities,” Wigness said. “Learning to encode behaviors that can be easily transferred between different platforms would be extremely valuable given a team of heterogeneous robots. In this case, the behavior can be learned on one platform instead of each platform individually.”
This research is funded through the Army’s Robotics Collaborative Technology Alliance, or RCTA, which brings together government, industrial and academic institutions to address research and development required to enable the deployment of future military unmanned ground vehicle systems ranging in size from man-portables to ground combat vehicles.
Ultimately, this research is crucial for the future battlefield, where Soldiers will be able to rely on robots with more confidence to assist them in executing missions.
“The capability for the Next Generation Combat Vehicle to autonomously maneuver at optempo in the battlefield of the future will enable powerful new tactics while removing risk to the Soldier,” said ARL researcher Dr. John Rogers. “If the NGCV encounters unforeseen conditions which require teleoperation, our approach could be used to learn to autonomously handle these types of conditions in the future.”