5 features ROS 2 needs to add for robotics developers


I’ve been using ROS 2 quite a bit over the past several months. As I’ve previously mentioned, it would appear there aren’t too many real robots running ROS 2 yet. We have a bit of a chicken-and-egg problem where the tools are not yet fully ready for real robots, but until people start using ROS 2 on real robots nobody knows the real pain points.

There are many things that could be done in ROS 2. But there is limited time to implement them all – so we need to focus on those that enable robots and their developers to “survive.”

I often get asked if ROS 2 is ready for primetime. My answer for a long time was “no.” I’m going to upgrade it to “maybe, depends on what you’re doing” at this point. Here are five things I think would make the answer “hell yes” for most roboticists. I actually hope this post ages poorly and that all these things come to happen in ROS 2.

Automatic QoS for RVIZ, ros2cli

Quality of Service (QoS) is probably the biggest change between ROS 1 and ROS 2 – it’s also the one that causes the most headaches, from what I can tell. The ROS 2 Foxy release adds a --verbose option to the ros2 topic info command, which is a huge step in the right direction. This lets you quickly diagnose when a publisher and subscriber are using incompatible QoS.
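To make "incompatible QoS" concrete, here is a minimal sketch of the request-versus-offer rule for two policies, reliability and durability: a subscription cannot ask for more than the publisher offers. The class and function names (QoS, Reliability, Durability, incompatibilities) are hypothetical stand-ins written for this illustration, not the rclpy API, and other policies such as deadline and liveliness are left out.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


@dataclass(frozen=True)
class QoS:
    """Hypothetical two-policy profile; real profiles carry more settings."""
    reliability: Reliability
    durability: Durability


def incompatibilities(offered: QoS, requested: QoS) -> List[str]:
    """List the policies where the subscriber requests more than is offered."""
    problems = []
    if requested.reliability > offered.reliability:
        problems.append("reliability")
    if requested.durability > offered.durability:
        problems.append("durability")
    return problems


# Reliability and durability as used by the default and sensor data presets.
DEFAULT = QoS(Reliability.RELIABLE, Durability.VOLATILE)
SENSOR_DATA = QoS(Reliability.BEST_EFFORT, Durability.VOLATILE)

if __name__ == "__main__":
    # A default (reliable) subscription against a sensor data publisher.
    print(incompatibilities(offered=SENSOR_DATA, requested=DEFAULT))
    # The reverse direction: best-effort subscription, reliable publisher.
    print(incompatibilities(offered=DEFAULT, requested=SENSOR_DATA))
```

The asymmetry is the point: a best-effort subscription can listen to a reliable publisher, but not the other way around.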

rosbag 2 got a huge upgrade in ROS 2 Foxy: it automatically determines the proper settings for QoS so that it always connects to the publisher you’re trying to record (note: if multiple publishers are publishing to the same topic with different QoS it may not work – but really, who does that?).
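One way such automatic selection could work is sketched below. This is an assumption-laden illustration, not rosbag2's actual code: it asks for the strictest setting that every discovered publisher can satisfy. The Offer type, the adapt_request function, and the fallback used when no publisher has been discovered are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Offer:
    """Hypothetical summary of one endpoint's QoS."""
    reliable: bool         # True = reliable, False = best effort
    transient_local: bool  # True = transient local, False = volatile


def adapt_request(offers: Iterable[Offer]) -> Offer:
    """Pick subscription settings that every offering publisher can match."""
    offers = list(offers)
    if not offers:
        # Assumption: with nothing discovered, request the least demanding
        # settings so that any later publisher can still be matched.
        return Offer(reliable=False, transient_local=False)
    return Offer(
        # Only request reliable delivery if all publishers offer it.
        reliable=all(o.reliable for o in offers),
        # Only request transient local durability if all publishers offer it.
        transient_local=all(o.transient_local for o in offers),
    )


if __name__ == "__main__":
    mixed = [Offer(reliable=True, transient_local=True),
             Offer(reliable=False, transient_local=True)]
    print(adapt_request(mixed))
```

With mixed publishers, this sketch falls back to the weaker setting for each policy, so the subscription connects to everyone but gives up guarantees that only some publishers offered.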

Now we need that feature in RVIZ 2 and the command line utilities (CLI). These are debugging tools, so they need to be able to “just work” in most scenarios.

Since most of the time you’re using RVIZ 2 to connect to sensor data, which is often published with a non-default QoS (the sensor data profile), it’s absolutely bonkers that RVIZ uses the default QoS on everything (which is reliable, and a reliable subscription is incompatible with the best-effort sensor data profile). Even something as simple as latched topics won’t work by default.

This is not an easy ask. It will involve significant changes to RVIZ as well as changes to lower level packages like message_filters, but I’m pretty sure this is the single biggest bang-for-your-buck improvement that will make ROS 2 work better for robot developers.

Documentation for ROS 2

OK, I’m sounding like a broken record (or the squeaky caster on your 8-year-old mobile manipulator), but this is really important.

I’m not just talking about the lack of tutorials here. One of the things that made ROS great for new developers in the 2011-2014 era (when it experienced huge growth in the community), was a very polished and up-to-date wiki. If you wanted to find out about a package, you could go to wiki.ros.org/package_name – and the documentation was right there (or if it wasn’t, you had a pretty good idea this package wasn’t ready for primetime). With ROS 2, we don’t have a centralized place for documentation yet – and I think that is holding the community growth back.

There is also the issue of “user documentation.” Nearly everything for ROS 2 is written assuming an expert programming background (even more so than ROS 1 documentation). Reading the source code is not how you’re supposed to learn how to run a ROS driver for a laser scanner.

Building out a community is super important. The best way to get a bug fixed is to find a developer who needs it fixed. I’ve only been using ROS 2 on-and-off for a couple months – and in that time I’ve fixed half a dozen bugs across multiple ROS 2 packages, and even taken on maintaining the ROS 2 port of urg_node and the related packages.

Subscriber connect callbacks

Now we’ll jump into a super technical issue – but the impact is huge – especially for those doing perception (which is, you know, generally a big part of robotics). When creating a publisher in ROS 1, you could register a callback that would get called whenever a subscriber connected or disconnected. This feature doesn’t yet exist in ROS 2, but I think it is essential for real robotics systems. Here’s why.

Robots can generate lots of sensor data – especially when you add processing pipelines into the mix. Sometimes you might need a high-resolution point cloud with color and depth information. Sometimes you need a low-res colorless point cloud. This is especially true when the robot system does multiple tasks. For instance, imagine a robot that is both mobile and a manipulator – for navigating the environment it wants that high frame rate, low-res point cloud for collision avoidance. When the mobile manipulator gets to the destination, it wants to switch to a high-res point cloud to decide what to grab.

Sometimes you literally can’t be publishing all the data streams possible because it would overwhelm the hardware (for instance, saturating the USB bus if you were to pull depth and color and IR from most RGBD sensors at the same time).

In ROS 1, you could create “lazy publishers” so that the creators of these intensive data types would only create and publish the data when someone was listening. They would be alerted to someone listening by the connect callback. The lack of lazy publishers throughout various drivers and the image_proc and depth_image_proc packages is a real challenge to building high performance perception systems. When people ask me if ROS 2 is ready, my first question these days is “how much perception/vision are you doing?”

To be clear, there are workarounds available in some cases. If you’re creating a publisher yourself, you can:

Create a loop that “polls” whether there are subscribers (using get_subscription_count), as I did in the openni2_camera package.
Use parameters to dynamically reconfigure what is running. While this might work in some cases (and maybe even be a preferred solution for some use cases), it likely leads to a more brittle system.
Re-architect your system so that it never needs lazy publishers, by hard coding exactly what you need for a given robot. While some of this is likely to happen in a more production environment, it doesn’t lend itself to code reuse and sharing, which was one of the major selling points of ROS 1.
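The first workaround, polling, can be sketched roughly as follows. The sketch is written against any object that exposes get_subscription_count() and publish(), which are the method names a ROS 2 publisher provides. FakePublisher, LazyPublisher, and the on_connect/on_disconnect hooks are hypothetical names invented here so the example runs without ROS installed. In a real node, poll() would be driven by a timer.

```python
from typing import Any, Callable, List, Optional


class FakePublisher:
    """Stand-in exposing the two publisher methods the sketch relies on."""

    def __init__(self) -> None:
        self.subscribers = 0
        self.sent: List[Any] = []

    def get_subscription_count(self) -> int:
        return self.subscribers

    def publish(self, msg: Any) -> None:
        self.sent.append(msg)


class LazyPublisher:
    """Runs an expensive producer only while at least one subscriber exists."""

    def __init__(
        self,
        publisher: Any,
        produce: Callable[[], Any],
        on_connect: Optional[Callable[[], None]] = None,
        on_disconnect: Optional[Callable[[], None]] = None,
    ) -> None:
        self._pub = publisher
        self._produce = produce
        self._on_connect = on_connect
        self._on_disconnect = on_disconnect
        self._active = False

    def poll(self) -> bool:
        """Call periodically; returns True if a message was published."""
        has_subscribers = self._pub.get_subscription_count() > 0
        # Emulate connect/disconnect callbacks by detecting transitions.
        if has_subscribers and not self._active and self._on_connect:
            self._on_connect()      # e.g. start the sensor stream
        if not has_subscribers and self._active and self._on_disconnect:
            self._on_disconnect()   # e.g. stop the sensor stream
        self._active = has_subscribers
        if not has_subscribers:
            return False            # skip the expensive work entirely
        self._pub.publish(self._produce())
        return True


if __name__ == "__main__":
    pub = FakePublisher()
    lazy = LazyPublisher(pub, produce=lambda: "point cloud")
    lazy.poll()          # no subscribers yet, so nothing is produced
    pub.subscribers = 1
    lazy.poll()          # a subscriber appeared, so one message goes out
    print(len(pub.sent))
```

The cost of this approach is latency and wasted wake-ups: a new subscriber is only noticed on the next poll, which is why a true connect callback would be preferable.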

Note that I said, “if you’re creating a publisher yourself”. There are lots of packages that are widely relied on in ROS 1 whose ROS 2 ports are crippled or broken due to the lack of subscriber connect callbacks: message_filters, image_transport, image_proc, depth_image_proc, and

Related answers.ros.org: ROS 2 publisher callback on subscription match

Developer involvement

In the month that I’ve been writing this post, a number of questions have been answered, so we’re already getting there!

I remember folks joking that ROS Answers was misnamed, because there were no answers there, just questions. It’s actually not true, but there are very few answers if you search for the ROS 2 tag.

There are a lot of really good questions there. Like, stuff that’s not anywhere in the documentation and is probably quite relevant to a large number of users. Here’s a few examples:

ROS 2 developers, please take note: we’ve got lots of great features in this system, please help your users learn how to actually use them – maybe they’ll even help contribute back!

Your robot on ROS 2

There’s probably a bunch of other bugs/issues/etc hiding in the weeds. Your robot is probably not exactly the same as mine – and your use cases are going to be different. We need more robots running ROS 2 to dig into things. The good news is: you can install ROS 1 and ROS 2 on the same system and switch back and forth pretty easily.

About the Author

Michael Ferguson is Director of R&D at Cobalt Robotics, a leading provider of robotic security services based in San Mateo, Calif. Previously he was CTO of Fetch Robotics, leading the development of software and electronics for the Fetch and Freight robots.

Ferguson was also Co-Founder of Unbounded Robotics, a spin-off from Willow Garage. He began working with ROS in 2010 as a Software Engineer at Willow Garage. To follow Ferguson’s musings on robotics, check out his Robot & Chisel blog.

Comments

Ravi Reddy says

August 5, 2020 at 2:21 pm

Great article!
I am currently facing this problem of needing subscriber connect callbacks, because we are migrating from ROS1 to ROS2.

All great points raised by Michael. We would like to see them addressed in subsequent updates as much as possible, and we should get started.
Thank you

Andrei says

August 18, 2020 at 7:25 am

About subscriber connect callbacks:

By default, ROS2 uses DDS as the underlying standardized middleware solution.
This functionality of detecting listening subscribers is found in the DDS standard (https://www.omg.org/spec/DDS/1.4/PDF, pages 118 and 125 of the PDF), through a QoS setting called ‘liveliness’ and a mechanism called ‘listener’. So the technical solution already exists underneath, but the ROS2 developers have to integrate these DDS features into ROS2.

However, ROS2 uses DDS through an abstraction layer, so it is possible to use something other than DDS as the underlying middleware.