Deep Teaching from Helm.ai designed to ease autonomous vehicle training

For robots and vehicles to become more autonomous, developers are looking for ways to build artificial intelligence that requires less data and less laborious annotation. Helm.ai Inc. last month announced “Deep Teaching,” which it described as a new methodology to train neural networks without human annotation or simulation.

The Menlo Park, Calif.-based startup claimed that Deep Teaching can deliver computer vision performance faster and more accurately than current methods. Helm.ai added that it can train on vast volumes of data more efficiently without needing large-scale fleets or numerous human annotators.

“Traditional AI approaches that rely upon manually annotated data are wholly unsuited to meet the needs of autonomous driving and other safety-critical systems that require human-level computer vision accuracy,” said Vlad Voroninski, CEO of Helm.ai. “Deep Teaching is a breakthrough in unsupervised learning that enables us to tap into the full power of deep neural networks by training on real sensor data without the burden of human annotation nor simulation.”

“The market price of annotation is dollars per image, and one vehicle can collect tens of millions of images per day,” he told The Robot Report. “Humans don’t just learn to drive through practice; we already understand many things from operating in the world, and we can readily interpret new scenarios previously unseen while driving.”
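
To put Voroninski's comment in rough numbers, here is a back-of-the-envelope sketch. The figures are illustrative assumptions, not Helm.ai data: $1 per image stands in for “dollars per image,” and 10 million images per day stands in for “tens of millions.”

```python
def daily_annotation_cost(images_per_day: int, price_per_image_usd: float) -> float:
    """Cost of manually annotating every image one vehicle collects in a day."""
    return images_per_day * price_per_image_usd


# Assumed lower-bound figures, chosen only for illustration.
IMAGES_PER_DAY = 10_000_000   # low end of "tens of millions"
PRICE_PER_IMAGE_USD = 1.0     # low end of "dollars per image"

cost = daily_annotation_cost(IMAGES_PER_DAY, PRICE_PER_IMAGE_USD)
print(f"${cost:,.0f} per vehicle per day")
```

Even under these conservative assumptions, annotating everything a single vehicle records would cost millions of dollars a day, which is why annotation is usually limited to a small sample of the data collected.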

Deep Teaching generalizes to new scenarios

In the first use case of Helm.ai’s Deep Teaching technology, the company trained a neural network to detect lanes, using tens of millions of images from thousands of dashcam videos recorded around the world, without any human annotation or simulation. The network was then able to handle corner cases well known to be difficult in the autonomous driving industry, such as rain, fog, glare, faded or missing lane markings, and various illumination conditions.

Helm.ai said that it was able to use this neural network to surpass public computer vision benchmarks with minimal engineering effort and a fraction of the cost and time required by traditional deep learning methods.

“We’ve developed the ability to train on raw sensor data without annotation or simulation,” Voroninski said. “By reducing the capital cost of learning from more images, we get more accurate results and more generalizable artificial intelligence.”

In addition, Helm.ai has built a full stack of software, enabling a vehicle to steer autonomously on steep and curvy mountain roads using only one camera and one GPU but no maps, no lidar, and no GPS. The system worked without prior training on data from these roads, said the company.

“A typical self-driving stack includes sensor data, a perception layer that interprets that data, an intention-prediction model that understands how agents might react in future, a path-planning module, and a vehicle-control stack to implement decisions,” Voroninski explained. “The control part is more or less solved, but quite a lot of heavy lifting happens at the perception and intent-prediction steps.”
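
The layered stack Voroninski describes can be sketched as a chain of stages, each consuming the previous stage's output. This is a minimal illustration only, not Helm.ai's software: every class, function, and value below is hypothetical, and the placeholder logic stands in for the trained models and planners a real system would use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Frame:
    """Raw sensor data; a frame ID stands in for a camera image."""
    frame_id: int


@dataclass
class Scene:
    """Output of the perception layer."""
    lanes: List[str]
    agents: List[str]


@dataclass
class Path:
    """Output of the path planner."""
    target_speed_mps: float


@dataclass
class Command:
    """Output of the vehicle-control layer."""
    steering: float
    throttle: float


def perceive(frame: Frame) -> Scene:
    # Placeholder: a real perception layer would run neural networks here.
    return Scene(lanes=["left", "right"], agents=["pedestrian"])


def predict_intent(scene: Scene) -> Dict[str, str]:
    # Placeholder: assume every detected agent intends to cross.
    return {agent: "crossing" for agent in scene.agents}


def plan_path(scene: Scene, intents: Dict[str, str]) -> Path:
    # Stop if any agent is predicted to cross; otherwise keep moving.
    stop = any(intent == "crossing" for intent in intents.values())
    return Path(target_speed_mps=0.0 if stop else 10.0)


def control(path: Path) -> Command:
    # Placeholder controller: no throttle when the plan says stop.
    throttle = 0.0 if path.target_speed_mps == 0.0 else 0.3
    return Command(steering=0.0, throttle=throttle)


def run_stack(frame: Frame) -> Command:
    """Sensor data -> perception -> intent prediction -> planning -> control."""
    scene = perceive(frame)
    intents = predict_intent(scene)
    path = plan_path(scene, intents)
    return control(path)


print(run_stack(Frame(frame_id=0)))
```

In this arrangement, the control stage is the simplest function, while the perception and intent-prediction stages are the ones a production system would replace with learned models.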

“When we first entered this space, we examined approaches that other companies were taking,” he said. “Traditional AI is not enough. A lot of research and development has been needed to get to the capabilities of Helm.ai today, and we had some unique advantages from merging our experience with applied mathematics and compressive sensing with our understanding of deep learning. At Helm, we have a small team of people with top skills in AI R&D focused on building a product.”

Since then, Helm.ai has applied Deep Teaching to semantic segmentation for dozens of object categories, monocular vision depth prediction, pedestrian intent modeling, lidar-vision fusion, and automation of HD mapping.

Benchmarks and awards

Helm.ai claimed that its Deep Teaching system has surpassed state-of-the-art production systems in performance benchmarks, noting that it has received recognition at Tech.AD Detroit.

“The metric of number of miles driven or how much fleet data is collected doesn’t indicate success,” said Voroninski. “Proving that the perception stack is able to make the right decisions is harder to convey. By training on large datasets with a wide variety of adversarial scenarios, we achieved generalization to handle corner cases out of the box.”

“We wanted to put our system under the same constraints as a production system,” he said. “We didn’t want to overfit to a particular scenario, and since we can’t control where a vehicle is driven in a production system, we tried the system in entirely new scenarios.”

Safety and L2 to L4 vehicles

AI and machine vision applications such as Web searches or parts inspections are not as time- and safety-critical as autonomous vehicles, said Helm.ai. The company said that its approach to “economical training on huge datasets of images and other sensor data” will benefit the self-driving car industry.

“Helm.ai’s self-driving technologies are uniquely suited to deliver on the potential of autonomous driving,” said Quora CEO Adam D’Angelo. “I look forward to the advances the team will continue to make in the years to come and am excited to have invested in the company.”

At the same time, Helm.ai is focusing on advanced driver-assist systems (ADAS) rather than Level 5 or fully autonomous vehicles. “We don’t depend on breakthroughs in sensor hardware modalities,” said Voroninski. “Being able to approach the capability of the human eye from a camera perspective is great, but the bottleneck is on the inference side, in interpreting sensor data.”

Helm.ai’s demonstrations have used a single camera, but other sensors could be helpful on the path to autonomy, Voroninski acknowledged.

“For example, radar gives more redundancy and robustness in rain, snow, or fog,” he said. “Lidar measures depth accurately but is susceptible to a host of other issues, including a tendency to bounce off of dust clouds or car exhaust. In order to actually figure out which lidar returns are relevant, you’d have to use vision anyway, which is why we believe training neural networks with computer vision via unsupervised learning is the most effective way to achieve truly scalable autonomous systems.”

Other opportunities for Deep Teaching

In addition to autonomous vehicles, Deep Teaching could be useful in aviation, robotics, manufacturing, and retail, said Helm.ai.

“We didn’t know how generalized Deep Teaching could be, but as we developed the technology, we discovered it was quite general,” said Voroninski. “It doesn’t matter to us which object category we train for; we can train for any of them.”

“There are opportunities for Helm in safety-critical systems that interact with the world and necessitate a high-level AI stack,” he said. “We are already working with several automotive manufacturers and fleets.”

Helm.ai raised $13 million in seed funding in March, before the COVID-19 pandemic significantly affected the U.S.

“The vast majority of what we do is software development, so we can be effective remotely,” Voroninski said. “We can test on live vehicles. The situation has highlighted the need for automation, which will speed up. But by the time robotaxis actually launch at scale, hopefully, COVID won’t be an issue by then.”

“Our value proposition to the ecosystem is stable — providing high-value autonomy software,” he said.
