What do popular games like Jenga and Pick Up Sticks have in common with training a robot to grasp and manipulate objects in the real world? The answer comes from an “active perception” project at the Australian Centre for Robotic Vision that has, quite literally, left rival approaches standing still in the complex task of visual grasp detection in real-world clutter.
“The idea behind it is actually quite simple,” said Ph.D. researcher Doug Morrison, who in 2018 created an open-source GG-CNN network enabling robots to more accurately and quickly grasp moving objects in cluttered spaces. “Our aim at the Centre is to create truly useful robots able to see and understand like humans. So, in this project, instead of a robot looking and thinking about how best to grasp objects from clutter while at a standstill, we decided to help it move and think at the same time.”
“A good analogy is how we humans play games like Jenga or Pick Up Sticks,” he said. “We don’t sit still, stare, think, and then close our eyes and blindly grasp at objects to win a game. We move and crane our heads around, looking for the easiest target to pick up from a pile.”
Stepping away from a static camera
As outlined in a research paper presented at the 2019 International Conference on Robotics and Automation (ICRA) in Montreal, the project’s active perception approach is the first in the world to focus on real-time grasping by stepping away from a static camera position or fixed data collecting routines.
It is also unique in the way it builds up a map of grasps in a pile of objects, which continually updates as the robot moves. This real-time mapping predicts the quality and pose of grasps at every pixel in a depth image, all at a speed fast enough for closed-loop control at up to 30Hz.
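The per-pixel idea can be illustrated with a minimal sketch. The real GG-CNN is a trained convolutional network; here a random stand-in function plays its role, producing the three output maps described above (grasp quality, gripper angle, and gripper width, one value per pixel of a depth image) and then selecting the best-scoring pixel. All function names and value ranges are illustrative assumptions, not the Centre's actual implementation.

```python
import numpy as np

def predict_grasp_maps(depth):
    """Stand-in for the network forward pass (illustrative only).

    Returns three maps the same shape as the depth image:
    per-pixel grasp quality, gripper angle, and gripper width.
    """
    h, w = depth.shape
    rng = np.random.default_rng(0)
    quality = rng.random((h, w))                     # grasp quality in [0, 1]
    angle = rng.uniform(-np.pi / 2, np.pi / 2, (h, w))  # gripper rotation (rad)
    width = rng.uniform(0.0, 0.1, (h, w))            # gripper opening (m)
    return quality, angle, width

def best_grasp(depth):
    """Pick the pixel with the highest predicted grasp quality."""
    quality, angle, width = predict_grasp_maps(depth)
    v, u = np.unravel_index(np.argmax(quality), quality.shape)
    return (u, v), angle[v, u], width[v, u], quality[v, u]

depth = np.zeros((300, 300), dtype=np.float32)
(u, v), theta, gw, q = best_grasp(depth)
print(f"grasp at pixel ({u}, {v}), angle {theta:.2f} rad, quality {q:.2f}")
```

Because every pixel gets a prediction in one forward pass, the whole map can be recomputed each frame, which is what makes closed-loop control at camera rate feasible.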
“The beauty of our active perception approach is that it’s smarter and at least 10 times faster than static, single viewpoint grasp detection methods,” Morrison said. “We strip out lost time by making the act of reaching towards an object a meaningful part of the grasping pipeline rather than just a mechanical necessity.
“Like humans, this allows the robot to change its mind on the go in order to select the best object to grasp and remove from a messy pile of others,” he added.
Morrison has tested and validated his active perception approach at the center’s laboratory at Queensland University of Technology (QUT). Trials involved using a robotic arm to “tidy up” 20 objects, one at a time, from a pile of clutter. His approach achieved an 80% success rate when grasping in clutter, an improvement of more than 12% over traditional single-viewpoint grasp-detection methods.
Morrison said he was especially proud of developing the Multi-View Picking (MVP) controller, which selects multiple informative viewpoints for an eye-in-hand camera while reaching to a grasp, revealing high-quality grasps hidden from a static viewpoint.
“Our approach directly uses entropy in the grasp pose estimation to influence control, which means that by looking at a pile of objects from multiple viewpoints on the move, a robot is able to reduce uncertainty caused by clutter and occlusions,” said Morrison. “It also feeds into safety and efficiency by enabling a robot to know what it can and can’t grasp effectively. This is important in the real world, particularly if items are breakable, like glass or china tableware messily stacked in a washing-up tray with other household items.”
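The role entropy plays here can be shown with a toy example. Below, a robot's belief over a handful of discretised grasp-angle bins at one pixel starts out uniform (maximum uncertainty); each additional viewpoint contributes a likelihood that is fused in Bayesian fashion, and the Shannon entropy of the belief drops. This is a hedged sketch of the general principle, with made-up numbers, not the MVP controller itself.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Prior over 8 discretised grasp-angle bins: uniform = maximum uncertainty.
belief = np.full(8, 1.0 / 8)

# Simulated likelihoods from two extra viewpoints, both favouring bin 2.
observations = [
    np.array([0.05, 0.1, 0.5, 0.1, 0.05, 0.05, 0.1, 0.05]),
    np.array([0.02, 0.08, 0.6, 0.1, 0.05, 0.05, 0.05, 0.05]),
]

entropies = [entropy(belief)]
for obs in observations:
    belief = belief * obs       # Bayesian fusion of the new view
    belief /= belief.sum()
    entropies.append(entropy(belief))
    print(f"after view: entropy {entropies[-1]:.3f} nats")
```

A controller can use this falling entropy as a signal: viewpoints expected to reduce it most are the informative ones worth steering the eye-in-hand camera toward.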
The next step for Morrison, as part of the center’s “Grasping With Intent” project funded by a $70,000 (U.S.) Amazon Research Award, is moving from safe and effective grasping into the realm of meaningful vision-guided robotic manipulation.
“In other words, we want a robot to not only grasp an object, but do something with it; basically, to usefully perform a task in the real world,” he said. “Take, for example, setting a table, stacking a dishwasher, or safely placing items on a shelf without them rolling or falling off.”
Active perception and adversarial shapes
Morrison has also set his sights on fast-tracking how a robot actually learns to grasp physical objects. Instead of using typical household items, he said he wants to create a truly challenging training data set of adversarial shapes.
“It’s funny because some of the objects we’re looking to develop in simulation could better belong in a futuristic science fiction movie or alien world — and definitely not anything humans would use on planet Earth!” said Morrison.
There is, however, method in this scientific madness. Training robots to grasp items designed for people is not efficient or beneficial for a robot.
“At first glance, a stack of ‘human’ household items might look like a diverse data set, but most are pretty much the same,” Morrison explained. “For example, cups, jugs, flashlights and many other objects all have handles, which are grasped in the same way and do not demonstrate difference or diversity in a data set.”
“We’re exploring how to put evolutionary algorithms to work to create new, weird, diverse and different shapes that can be tested in simulation and also 3D printed,” he said. “A robot won’t get smarter by learning to grasp similar shapes. A crazy, out-of-this world data set of shapes will enable robots to quickly and efficiently grasp anything they encounter in the real world.”
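One way an evolutionary algorithm can be steered toward diversity rather than a single best design is a novelty-style selection loop, sketched below. Shapes are stood in for by small parameter vectors, and fitness rewards distance to the nearest other members of the population, so candidates spread out instead of converging on one handle-like design. Every name and number here is an illustrative assumption, not the Centre's actual method.

```python
import numpy as np

rng = np.random.default_rng(42)
pop = rng.normal(size=(20, 4))      # 20 candidate "shape" parameter vectors

def novelty(x, population, k=3):
    """Mean distance to the k nearest other members: higher = more novel."""
    d = np.sort(np.linalg.norm(population - x, axis=1))
    return d[1:k + 1].mean()        # skip d[0], the distance to itself

for gen in range(30):
    scores = np.array([novelty(x, pop) for x in pop])
    parents = pop[np.argsort(scores)[-10:]]           # keep most novel half
    children = parents + rng.normal(scale=0.2, size=parents.shape)
    pop = np.vstack([parents, children])              # mutate and refill

spread = np.mean([novelty(x, pop) for x in pop])
print(f"mean novelty after evolution: {spread:.2f}")
```

In practice the evolved parameter vectors would drive a shape generator whose outputs can be tested in a grasping simulator and, as Morrison notes, 3D printed for real-world trials.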
Researchers from the Australian Centre for Robotic Vision also led workshops this week at the International Conference on Intelligent Robots and Systems (IROS 2019) in Macau, China, including one on autonomous object manipulation. In a workshop on the importance of uncertainty for deep learning in robotics, they discussed why robots, like humans, can suffer from overconfidence. In addition, the center has announced a Robotic Vision Challenge to help robots sidestep the pitfalls of overconfidence.
About the Australian Centre for Robotic Vision
The Australian Centre for Robotic Vision is an ARC Centre of Excellence, funded for $25.6 million over seven years. It claims to be the largest collaborative group of its kind generating internationally impactful science and new technologies to transform important Australian industries and solve some of the hard challenges facing Australia and the globe.
Formed in 2014, the Australian Centre for Robotic Vision said it is the world’s first research center specializing in robotic vision. Its researchers are on a mission to develop new robotic vision technologies to expand the capabilities of robots. They intend to give robots the ability to see and understand, so they can improve sustainability for people and the environments we live in.
The Australian Centre for Robotic Vision has assembled an interdisciplinary research team from four leading Australian research universities: QUT, The University of Adelaide (UoA), The Australian National University (ANU), and Monash University. It includes the Commonwealth Scientific and Industrial Research Organization’s (CSIRO) Data61 and overseas institutions, such as the French national research institute for digital sciences (INRIA), Georgia Institute of Technology, Imperial College London, the Swiss Federal Institute of Technology Zurich (ETH Zurich), University of Toronto, and the University of Oxford.