The Robot Report


NVIDIA 6-DoF pose estimation trained on synthetic data

By Steve Crowe | October 29, 2018


Knowing the 3D position and orientation of an object, often referred to as its 6-DoF pose, is key to enabling robots to manipulate objects that aren’t in the same place every time. NVIDIA researchers have developed a deep learning system, trained on synthetic data, that estimates this pose using one RGB camera.

NVIDIA said its Deep Object Pose Estimation (DOPE) system, which was introduced this morning at the Conference on Robot Learning (CoRL) in Zurich, Switzerland, is another step toward enabling robots to work effectively in complex environments. For more detail, read the paper “Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects.”

Stan Birchfield, a Principal Research Scientist at NVIDIA, told The Robot Report that with NVIDIA’s algorithm and a single image, a robot can infer the 3D pose of an object for the purpose of grasping and manipulating it. Synthetic data has an advantage over real data: an almost unlimited amount of labeled training data can be generated for deep neural networks.

“Real data needs to be annotated by hand. It’s very hard for a non-expert to label these images,” Birchfield said. “We’ve been looking at how to train networks with synthetic data only for some time.”

One of the key challenges of synthetic data, NVIDIA said, is bridging the “reality gap” so that networks trained on synthetic data operate correctly on real-world data. NVIDIA said its one-shot deep neural network has accomplished that, albeit on a limited basis. The researchers trained the network on NVIDIA Tesla V100 GPUs in a DGX Station, using the cuDNN-accelerated PyTorch deep learning framework. The synthetic training data was generated by a custom plugin that NVIDIA developed for Unreal Engine and made publicly available to other researchers.

“Specifically, we use a combination of non-photorealistic domain randomized (DR) data and photorealistic data to leverage the strengths of both,” NVIDIA researchers wrote in their paper. “These two types of data complement one another, yielding results that are much better than those achieved by either alone. Synthetic data has an additional advantage in that it avoids overfitting to a particular dataset distribution, thus producing a network that is robust to lighting changes, camera variations, and backgrounds.”
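To make the idea of mixing the two data sources concrete, here is a minimal Python sketch. It is not NVIDIA's plugin or training code: the function names, the parameter ranges, and the 50/50 mixing ratio are all hypothetical placeholders chosen for illustration. It only shows that each training image can be drawn either from a domain-randomized scene, where nuisance factors such as lighting and background are sampled at random, or from a photorealistic render.

```python
import random

def sample_dr_scene(rng):
    """Draw one set of randomized scene parameters (all ranges are illustrative)."""
    return {
        "num_distractors": rng.randint(0, 10),      # random clutter objects
        "light_intensity": rng.uniform(0.2, 3.0),   # arbitrary brightness scale
        "background_id": rng.randrange(1000),       # index into a background pool
        "object_yaw_deg": rng.uniform(0.0, 360.0),  # random object orientation
    }

def build_training_plan(num_images, dr_fraction, rng):
    """Label each training image as domain-randomized or photorealistic."""
    plan = []
    for _ in range(num_images):
        if rng.random() < dr_fraction:
            plan.append(("domain_randomized", sample_dr_scene(rng)))
        else:
            plan.append(("photorealistic", None))
    return plan

rng = random.Random(0)
# dr_fraction=0.5 is an assumed placeholder, not a figure from the paper.
plan = build_training_plan(1000, dr_fraction=0.5, rng=rng)
num_dr = sum(1 for source, _ in plan if source == "domain_randomized")
print(f"{num_dr} domain-randomized, {len(plan) - num_dr} photorealistic")
```

In a real pipeline the sampled parameters would drive a renderer; here they are only collected in a list so the mixing logic can be inspected.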

Figure 3: Pose estimation of YCB objects on data showing extreme lighting conditions. TOP: PoseCNN [5], which was trained on a mixture of synthetic data and real data from the YCB-Video dataset [5], struggles to generalize to this scenario captured with a different camera, extreme poses, severe occlusion, and extreme lighting changes. BOTTOM: NVIDIA’s DOPE method generalizes to these extreme real-world conditions even though it was trained only on synthetic data; all objects are detected except the occluded soup can (2nd column) and three dark cans (3rd column).

Testing NVIDIA’s system

The system estimates pose in two steps. First, the deep neural network estimates belief maps of 2D keypoints for all the objects in the image coordinate system. Next, peaks from these belief maps are fed to a standard perspective-n-point (PnP) algorithm to estimate the 6-DoF pose of each object instance.
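The two steps can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not NVIDIA's implementation: the belief maps are synthetic rather than network outputs, each map is assumed to contain a single object instance, the keypoints are assumed to be the eight corners of a hypothetical box plus its centroid, the camera intrinsics `K` are made up, and a simple linear DLT solve stands in for the "standard PnP algorithm" (in practice a library routine such as OpenCV's `cv2.solvePnP` would typically be used).

```python
import numpy as np

def belief_map_peak(belief):
    """Return the (u, v) pixel of the strongest response in one 2D belief map."""
    row, col = np.unravel_index(np.argmax(belief), belief.shape)
    return float(col), float(row)

def project(points_3d, R, t, K):
    """Pinhole projection of object-frame points into pixel coordinates."""
    pix = (points_3d @ R.T + t) @ K.T
    return pix[:, :2] / pix[:, 2:3]

def pnp_dlt(points_3d, points_2d, K):
    """Estimate R, t (object -> camera) with a linear DLT solve.

    A stand-in for a standard PnP solver; needs >= 6 non-coplanar points.
    """
    # Convert pixels to normalized camera coordinates so the unknown is [R | t].
    ones = np.ones((len(points_2d), 1))
    normalized = (np.linalg.inv(K) @ np.hstack([points_2d, ones]).T).T
    rows = []
    for X, (x, y, _) in zip(points_3d, normalized):
        Xh = np.append(X, 1.0)
        rows.append(np.concatenate([Xh, np.zeros(4), -x * Xh]))
        rows.append(np.concatenate([np.zeros(4), Xh, -y * Xh]))
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    P = vt[-1].reshape(3, 4)
    # Remove the unknown scale and sign, projecting onto the nearest rotation.
    u, s, vt_r = np.linalg.svd(P[:, :3])
    R, scale = u @ vt_r, s.mean()
    if np.linalg.det(R) < 0:
        R, scale = -R, -scale
    return R, P[:, 3] / scale

# Hypothetical 16 x 6 x 20 cm box: keypoints are its 8 corners plus the centroid.
half = np.array([0.08, 0.03, 0.10])
signs = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
keypoints_3d = np.vstack([signs * half, np.zeros(3)])
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])  # assumed

# Made-up ground-truth pose used only to synthesize the belief maps.
a, b = np.radians(30.0), np.radians(20.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0], [np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0], [0.0, np.cos(b), -np.sin(b)], [0.0, np.sin(b), np.cos(b)]])
R_true, t_true = Rz @ Rx, np.array([0.05, -0.02, 0.60])

# Step 1 stand-in: one belief map per keypoint, peaked at the rounded projection.
belief_maps = np.zeros((len(keypoints_3d), 480, 640))
for i, (u_px, v_px) in enumerate(project(keypoints_3d, R_true, t_true, K)):
    belief_maps[i, int(round(v_px)), int(round(u_px))] = 1.0
peaks = np.array([belief_map_peak(belief) for belief in belief_maps])

# Step 2: recover the 6-DoF pose from the 2D peaks and the known 3D keypoints.
R_est, t_est = pnp_dlt(keypoints_3d, peaks, K)
print("translation error (m):", np.linalg.norm(t_est - t_true))
```

Because the peaks are rounded to whole pixels, the recovered pose is only approximate. A linear DLT solve is also more sensitive to noise than the iterative or EPnP-style solvers found in vision libraries, which is one reason to treat this as a teaching sketch rather than a drop-in component.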

To put its pose estimation system to the test, NVIDIA attached a Logitech C960 RGB camera to the waist of a Baxter two-armed cobot from Rethink Robotics. The Logitech camera was calibrated to the robot base using a standard checkerboard target visible to both the Logitech camera and the wrist camera. The parallel jaw gripper moves from an opening of approximately 10 cm to 6 cm, or from 8 cm to 4 cm, depending on the thickness of the rubber tips installed.
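The article does not spell out the calibration math, but one common way to use a target seen by two cameras is to chain rigid transforms. The Python sketch below assumes that the wrist camera's pose in the robot base frame is known from the arm's kinematics and that each camera's pose relative to the checkerboard has already been estimated. The function names and all numeric poses are made up for illustration. `T_a_b` denotes the 4x4 transform that maps coordinates in frame `b` to frame `a`.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation matrix and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_transform(T):
    """Invert a rigid transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    return make_transform(R.T, -R.T @ t)

def rot_z(degrees):
    """Rotation about the z axis, used here only to build example poses."""
    a = np.radians(degrees)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a), np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def camera_in_base(T_base_wrist, T_wrist_board, T_cam_board):
    """Chain base <- wrist camera <- checkerboard <- external camera."""
    return T_base_wrist @ T_wrist_board @ invert_transform(T_cam_board)

# Made-up example poses (meters); real values would come from the robot and
# from a checkerboard pose estimate for each camera.
T_base_wrist = make_transform(rot_z(90.0), [0.6, 0.2, 0.4])
T_wrist_board = make_transform(rot_z(-30.0), [0.0, 0.1, 0.5])
T_cam_board = make_transform(rot_z(45.0), [0.1, 0.0, 0.8])

T_base_cam = camera_in_base(T_base_wrist, T_wrist_board, T_cam_board)
print(np.round(T_base_cam, 3))
```

Once `T_base_cam` is known, any object pose estimated in the external camera's frame can be expressed in the robot base frame by a single matrix product.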

The researchers tested five objects, each placed amid clutter at four locations on a table in front of the robot and in three orientations at each location, for 12 trials per object. In each trial, the Baxter robot was instructed to move to a pre-grasp point above the object and then execute a top-down grasp. The number of successful grasps out of 12 for each object was: 10 (cracker), 10 (meat), 11 (mustard), 11 (sugar), and 7 (soup).
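The arithmetic behind these figures is easy to check. The short Python snippet below uses only the numbers reported above (four locations, three orientations, and the per-object success counts) to compute per-object and overall success rates; the variable names are ours.

```python
locations, orientations = 4, 3
trials_per_object = locations * orientations  # 4 x 3 = 12 trials per object

# Successful grasps per object, as reported in the article.
successes = {"cracker": 10, "meat": 10, "mustard": 11, "sugar": 11, "soup": 7}

for name, count in successes.items():
    print(f"{name:8s} {count:2d}/{trials_per_object} = {count / trials_per_object:.1%}")

total_success = sum(successes.values())
total_trials = trials_per_object * len(successes)
overall = total_success / total_trials
print(f"overall  {total_success}/{total_trials} = {overall:.1%}")
```

Across all five objects this comes to 49 successful grasps in 60 attempts, or roughly 82 percent.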

NVIDIA said the round shape of the soup can caused some issues with the top-down grasps. When the researchers repeated the experiment with the can of soup lying on its side, the number of successful grasps increased to 9 of 12 attempts.

Rethink Robotics closed its doors on October 3. The IP has since been acquired by HAHN Group, a German automation specialist that will continue to manufacture and sell the Sawyer cobot. We asked Birchfield for his thoughts on the Baxter robot.

“As a researcher, we’ve been very happy with Baxter. It has a large amount of capability for the price,” said Birchfield. “Baxter doesn’t know the company went out of business. But our robotics lab has a variety of robots that will enable us to test different robots going forward.”

Next steps for NVIDIA

At press time, Birchfield said the system was only trained on those five objects. The researchers are working off the well-known Yale-CMU-Berkeley (YCB) Object and Model Set, which consists of 77 everyday items. Birchfield said there is no limit to the number of objects the system can detect, but the researchers “took a subset that represents a variety of different sizes and shapes that are easily accessible for people to go to the store and try out.”

Birchfield said this system will enable other robotics developers to get a jumpstart on their projects by solving a key part of the perception problem.

“Robotics is such a multi-disciplinary field that researchers have a challenge in front of them because of time,” Birchfield said. “Oftentimes with perception, folks will use AR tags to help solve that problem. Our technology will help them get one step closer to the real world without using AR tags.”

NVIDIA said the next steps are to increase the number of detectable objects, handle symmetry and incorporate closed-loop refinement to increase grasp success.

About The Author

Steve Crowe

Steve Crowe is Executive Editor, Robotics, WTWH Media, and chair of the Robotics Summit & Expo and RoboBusiness. He is also co-host of The Robot Report Podcast, the top-rated podcast for the robotics industry. He joined WTWH Media in January 2018 after spending four-plus years as Managing Editor of Robotics Trends Media. He can be reached at scrowe@wtwhmedia.com.

