The Robot Report


NVIDIA’s RVT can learn new tasks after just 10 demos

By Brianna Wessling | June 30, 2023

NVIDIA Robotics Research has announced new work that combines text prompts, video input, and simulation to more efficiently teach robots how to perform manipulation tasks, like opening drawers, dispensing soap, or stacking blocks, in real life. 

Generally, methods of 3D object manipulation perform better when they build an explicit 3D representation rather than relying only on camera images. NVIDIA wanted a method that costs less to compute and scales more easily than explicit 3D representations like voxels. To do so, the company used a type of neural network called a multi-view transformer to create virtual views from the camera input.
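The article does not describe how the virtual views are produced, so the following is only a minimal sketch of the general idea, assuming a 3D point cloud has already been reconstructed from the camera input and that each virtual view is an orthographic projection along one axis. The view names, the `render_view` helper, and the tiny 4 x 4 image size are all hypothetical and chosen for illustration.

```python
# Hypothetical virtual views: each maps a 3D point (x, y, z) to
# (u, v, depth) by treating one axis as the viewing direction.
VIEWS = {
    "top":   lambda p: (p[0], p[1], p[2]),  # look along the z axis
    "front": lambda p: (p[0], p[2], p[1]),  # look along the y axis
    "side":  lambda p: (p[1], p[2], p[0]),  # look along the x axis
}


def render_view(points, view, size=4, lo=0.0, hi=1.0):
    """Rasterize points into a size x size depth image for one virtual view.

    Workspace coordinates are assumed to lie in [lo, hi] on every axis.
    Each pixel keeps the largest depth value that falls into it, and
    pixels that receive no point stay None.
    """
    project = VIEWS[view]
    image = [[None] * size for _ in range(size)]
    scale = size / (hi - lo)
    for p in points:
        u, v, depth = project(p)
        col = min(size - 1, max(0, int((u - lo) * scale)))
        row = min(size - 1, max(0, int((v - lo) * scale)))
        if image[row][col] is None or depth > image[row][col]:
            image[row][col] = depth
    return image


# A made-up point cloud with three points in a unit-cube workspace.
cloud = [(0.1, 0.1, 0.5), (0.1, 0.1, 0.9), (0.9, 0.6, 0.2)]
virtual_views = {name: render_view(cloud, name) for name in VIEWS}
```

In this sketch the first two points fall into the same pixel of the top view, so only the larger depth is kept there, while the front view separates them. A set of such 2D images is far smaller than a dense voxel grid of the same workspace, which is the kind of saving the paragraph above points to.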

The team’s multi-view transformer, Robotic View Transformer (RVT), is both scalable and accurate. RVT takes camera images and task language descriptions as inputs and predicts the gripper pose action. In simulations, NVIDIA’s research team found that just one RVT model can work well across 18 RLBench tasks with 249 task variations. 

The model can perform a variety of manipulation tasks in the real world with around 10 demonstrations per task. The team trained one RVT model on real-world data and another on RLBench simulation data. In each setting, a single trained model was evaluated on all tasks.

The team found that RVT had a 26% higher relative success rate than existing state-of-the-art models. RVT is not only more successful than other models; it also learns faster. NVIDIA’s model trains 36 times faster than PerAct, an end-to-end behavior-cloning agent that learns a single language-conditioned policy for 18 RLBench tasks with 249 unique variations, and achieves 2.3 times the inference speed of PerAct.
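To make the arithmetic behind these figures concrete, here is a small sketch of how a relative improvement and a speedup factor are typically computed. The baseline values (a 50% success rate, 72 training hours, 5 inferences per second) are hypothetical placeholders, not numbers from NVIDIA; only the 26%, 36x, and 2.3x factors come from the article.

```python
def relative_improvement(new, baseline):
    """Fractional gain of `new` over `baseline`, e.g. 0.26 means 26% higher."""
    return (new - baseline) / baseline


def time_after_speedup(baseline_time, speedup):
    """Time needed once a process runs `speedup` times faster."""
    return baseline_time / speedup


# Hypothetical baseline numbers, for illustration only.
baseline_success = 0.50   # assumed baseline success rate
new_success = 0.63        # a rate that is 26% higher in relative terms
baseline_train_hours = 72.0
baseline_fps = 5.0

gain = relative_improvement(new_success, baseline_success)
train_hours = time_after_speedup(baseline_train_hours, 36)
fps = baseline_fps * 2.3

print(f"relative gain: {gain:.0%}")           # relative gain: 26%
print(f"training time: {train_hours:.1f} h")  # training time: 2.0 h
print(f"inference rate: {fps:.1f} per s")     # inference rate: 11.5 per s
```

Note that a 26% relative gain is not the same as 26 percentage points: in this made-up case the absolute difference is 13 points.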

While RVT was able to outperform similar models, it does come with some limitations that NVIDIA would like to look into further. For example, the team explored various view options for RVT and landed on an option that worked well across tasks, but in the future, the team would like to better optimize view specification using learned data. 

Like explicit voxel-based methods, RVT also requires the extrinsics from the camera to the robot base to be calibrated, and in the future, the team would like to explore extensions that remove this constraint.
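For readers unfamiliar with the term, extrinsics describe where a camera sits relative to another frame, here the robot base. A minimal sketch of what calibrated extrinsics make possible is shown below: mapping a point seen by the camera into robot base coordinates with a 4 x 4 homogeneous transform. The matrix values and the names `T_BASE_CAMERA` and `transform_point` are hypothetical, not taken from RVT.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T (nested row-major lists) to a 3D point."""
    x, y, z = p
    return tuple(
        T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3)
    )


# Hypothetical extrinsics: the camera frame is rotated 90 degrees about the
# base z axis and offset by (0.5, 0.0, 1.0) meters from the robot base.
T_BASE_CAMERA = [
    [0.0, -1.0, 0.0, 0.5],
    [1.0,  0.0, 0.0, 0.0],
    [0.0,  0.0, 1.0, 1.0],
    [0.0,  0.0, 0.0, 1.0],
]

point_in_camera = (0.2, 0.1, 0.3)
point_in_base = transform_point(T_BASE_CAMERA, point_in_camera)
print(point_in_base)  # approximately (0.4, 0.2, 1.3)
```

If this matrix is wrong, every observed point lands in the wrong place relative to the gripper, which is why the calibration requirement is listed as a constraint.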

About The Author

Brianna Wessling

Brianna Wessling is an Associate Editor, Robotics, WTWH Media. She joined WTWH Media in November 2021, after graduating from the University of Kansas with degrees in Journalism and English. She covers a wide range of robotics topics, but specializes in women in robotics, autonomous vehicles, and space robotics.

She can be reached at bwessling@wtwhmedia.com


Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media