Intel Corp. has been a strong supporter of research into artificial intelligence, machine learning, and computer vision, and two of its collaborations have implications for robots that operate in dynamic environments such as households. This week, Intel AI Lab and researchers at Oregon State University and the University of California, San Diego, presented papers that offer a new model for reinforcement learning and a massive dataset for training object recognition, respectively.
“We want to expand two approaches for machine learning to explore spaces with a more complex set of interactions,” said Hanling Tang, principal engineer in the AI Products Group at Intel. “Industry can do a lot to help design and fund large-scale data sets that touch on emerging areas.”
In reinforcement learning, machines can make choices to optimize potential rewards, or they can explore their environments to gather more data for more robust decision-making. At the 36th International Conference on Machine Learning this week, researchers from Oregon State and Intel AI Lab offered a solution to the exploit/explore challenge that they called “Collaborative Evolutionary Reinforcement Learning” (CERL).
Recognizing everyday objects such as doorknobs, switches, or mug handles is essential for robots to be able to interact with them. A major challenge for visual AI is adding context to large datasets.
Hao Su, an assistant professor at UC San Diego, and Subarna Tripathi, a deep learning data scientist at Intel AI Lab, this week presented a paper at the 2019 Conference on Computer Vision and Pattern Recognition introducing PartNet. They called it “the first large-scale dataset with fine-grained hierarchical, instance-level part annotations.”
CERL combines reinforcement learning methods
“We’re looking at scaling up algorithms from game scenarios to something with more real-world constraints,” said Somdeb Majumdar, deep learning data scientist at the Intel AI Lab. “There are problems to solve before deploying to physical robots interacting with humans.”
Instead of programming a computer or robot for every possible decision, traditional reinforcement learning for neural networks (also known as policy networks) uses policy gradient methods, Majumdar told The Robot Report. These favor policies with a higher probability of more immediate positive rewards.
Another approach is the evolutionary algorithm (EA), a population-based method inspired by natural evolution that selects the strongest candidates in each generation. However, EA takes more processing time because it evaluates candidates only after an exploration episode, explained the Intel and OSU researchers.
CERL combines policy gradient and EA methods, allowing for more rapid machine learning while also handling rewards over longer timeframes. As a result, the researchers found that CERL solved standard academic benchmarks using fewer cumulative training samples than either method alone.
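To make the general idea concrete, here is a minimal Python sketch of a hybrid search in which a gradient-based learner periodically feeds its parameters into an evolving population. It is a toy illustration only, not the CERL algorithm from the paper: the quadratic `reward` function, the `TARGET` vector, and every function name and hyperparameter below are assumptions chosen for brevity, and a plain gradient-ascent step stands in for a real policy-gradient update.

```python
import random

# Hypothetical optimum of the toy reward; not taken from the paper.
TARGET = [1.0, -2.0, 0.5]


def reward(theta):
    """Toy episode return: higher is better, with a maximum of 0 at TARGET."""
    return -sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def gradient_step(theta, lr=0.1):
    """One gradient-ascent step on the toy reward.

    This stands in for a policy-gradient update, which gets feedback
    at every step instead of only at the end of an episode.
    """
    return [t + lr * (-2.0 * (t - g)) for t, g in zip(theta, TARGET)]


def mutate(theta, rng, sigma=0.1):
    """Evolutionary exploration: add Gaussian noise to every parameter."""
    return [t + rng.gauss(0.0, sigma) for t in theta]


def hybrid_search(generations=30, pop_size=10, n_elites=3, seed=0):
    """Evolve a population while a gradient learner trains alongside it."""
    rng = random.Random(seed)
    population = [[rng.uniform(-3, 3) for _ in TARGET] for _ in range(pop_size)]
    learner = [rng.uniform(-3, 3) for _ in TARGET]

    for _ in range(generations):
        # Gradient-based learner improves quickly from dense feedback.
        learner = gradient_step(learner)

        # Rank candidates by whole-episode reward, then let a copy of the
        # learner replace the weakest member of the population.
        population.sort(key=reward, reverse=True)
        population[-1] = list(learner)
        population.sort(key=reward, reverse=True)

        # Keep the strongest candidates and refill with mutated copies.
        elites = population[:n_elites]
        children = [mutate(rng.choice(elites), rng)
                    for _ in range(pop_size - n_elites)]
        population = elites + children

    return max(population, key=reward)


if __name__ == "__main__":
    best = hybrid_search()
    print("best parameters:", best)
    print("best reward:", reward(best))
```

In this sketch, the population supplies exploration and a selection signal based on the whole episode, while the injected learner supplies fast local improvement. That division of labor is the intuition behind combining the two families of methods.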
“This work with Oregon State goes back six to eight months ago,” said Majumdar. “In the last four months, we scaled up to multiple robots performing a single coordination task.”
“Every interaction had 10 candidates, because each robot solved a different part. They converged on a solution much faster,” said Majumdar. “Because of this population-based approach, we didn’t need to tune the neural network for behaviors. Figuring out how many layers or filters is usually a bit of a time sink, but we were able to offload the problem to the population.”
CERL also takes advantage of large CPU clusters rather than memory-limited GPUs. “The forward-propagation open loop is very CPU-intensive, and large populations are more effective at exploring space,” Majumdar said. “We’re taking advantage of different types of clusters that are available.”
“We’re working to port some algorithms from simulation into a humanoid robot to walk on different surfaces in Oregon,” Majumdar said. “As more robotics labs get involved, we can take advantage of the fact that lots of academic labs have physical robots to test algorithms on.”
“You can take this and apply it to any workload that applies to learning strategies, such as a chip layout problem,” Majumdar said. “You can formulate it as a Markovian game and apply a reinforcement learning solution. The impact of this type of population-based work goes beyond physical robotics.”
PartNet adds part-level understanding for 3D datasets
“Intel cares about different datasets,” said Tripathi. “Last year, we got to know Prof. Hao Su, whose lab focuses on computer vision, and we decided to create a large-scale, fine-grained dataset.”
“To sit on a chair is actually quite complicated,” said UC San Diego Prof. Hao Su. “It may not be facing you, and you need to pull it by an arm, and then you have to understand where the cushion is and what the back is. To interact with an object, a robot must have an understanding of its parts.
“In most virtual environments that roboticists are using, there’s a lack of complexity and scale,” he said. “We want to achieve AI with real-world understanding of complex geometries and physical properties.”
Most existing 3D shape datasets annotate components or parts of objects in a small number of instances or in a non-hierarchical manner. The UC San Diego and Intel researchers announced PartNet, which includes 573,585 semantically identified subcomponents for 26,671 shapes or object point clouds.
These annotations are connected to 24 categories of everyday objects such as lamps, doors, or chairs. In the case of a lamp, parts include “lamp shade,” “lightbulbs,” and “pull chain.” The paper proposes fine-grained, instance, and hierarchical segmentation to partition objects, distinguish parts from one another, and recognize types of parts, respectively.
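A hierarchical, instance-level annotation can be pictured as a tree in which each node carries both a semantic label and its own instance ID. The Python sketch below is an illustrative assumption, not PartNet's actual file format or API: the `PartNode` class, its fields, and the grouping of two bulbs under a "lightbulbs" node are hypothetical, and only the part names come from the lamp example above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class PartNode:
    """One node in a hypothetical part hierarchy (not PartNet's real schema)."""
    label: str        # semantic part type, e.g. "lightbulb"
    instance_id: int  # tells apart parts that share the same label
    children: List["PartNode"] = field(default_factory=list)

    def leaves(self) -> Iterator["PartNode"]:
        """Yield the finest-grained parts, left to right."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()

    def count(self) -> int:
        """Count this node plus every part beneath it."""
        return 1 + sum(child.count() for child in self.children)


# A lamp with two separate bulbs: same label, different instance IDs.
lamp = PartNode("lamp", 0, [
    PartNode("lamp shade", 1),
    PartNode("lightbulbs", 2, [
        PartNode("lightbulb", 3),
        PartNode("lightbulb", 4),
    ]),
    PartNode("pull chain", 5),
])

if __name__ == "__main__":
    for part in lamp.leaves():
        print(part.instance_id, part.label)
```

Walking the leaves gives a fine-grained view, the instance IDs separate the two bulbs, and the parent links preserve the hierarchy, so one structure can serve all three kinds of segmentation.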
“Recognizing not only an object’s identity but also its properties, parts, membership, and functionality of the parts will greatly impact the fields of computer vision, object recognition, and robot motion,” Hao Su said. “For a long time, computer vision and perception was a segmented community from planning control and actuation. PartNet could help provide a common infrastructure.”
Human annotation for smarter robots
To create and annotate the dataset, the PartNet researchers, including Angel X. Chang from Simon Fraser University, used hierarchical part templates and worked with 66 annotators.
“They built a knowledge base of geometry, physics, and semantic properties,” said Hao Su. “The project dates back to 2016, and we built on the ShapeNet project of 3D models. With a browser-based user interface, annotators could cut one piece into smaller ones or merge parts into a bigger one, following the hierarchical taxonomy. After annotation, there’s a sanity check or quality-control process, since we’re still finding some inconsistencies among humans.”
With the PartNet dataset, researchers can build large-scale simulated environments with objects — including their component parts and functions. Such simulations can be used to train robots on how to interact with objects such as a microwave oven.
“Task-aware grasping is currently based on geometry, but many robots don’t know the parts,” Hao Su said. “For example, it might not know to grab a mug by the handle, so it might have to deal with a hot container.”
“PartNet is the first step to building a dynamic virtual environment,” said Kaichun Mo, a Ph.D. candidate at the Stanford University AI Lab and first author on the research paper. “To open a locked door, a robot must first recognize the keyhole. It’s a challenging problem.”
“Then to open the door, the robot must understand the state change from locked to unlocked, which we’re currently working on,” he said. “Finally, the robot must figure out the physical details. How much force is needed? What is the door made out of? Where should it apply force?”
“PartNet will be the basis for the next step of annotations, on mobility and dynamic properties,” Hao Su said. “We’ve received e-mails asking for more properties. Some companies have unique data, and other universities have started using PartNet to fine-tune their data on recognition results.”
“Intel has done real groundbreaking work in reinforcement learning,” said Hao Su. “This could help with the progress of a domestic robot, which would be better able to take care of the young and old. It would need to understand all the objects in a home.”
“While PartNet is closed, we welcome international collaborators for usage,” said Kaichun Mo. The full PartNet paper, including a sample dataset and results, is available online.