Robotics developers have found more and more ways to use generative artificial intelligence (AI) models, like ChatGPT and GitHub Copilot, to develop robots. In just the last few months, Microsoft and Siemens announced they would use generative AI to enhance factory automation, Orbbec released a 3D camera SDK that uses ChatGPT, and OpenAI released APIs for production use of ChatGPT.
Unlocking embodied AI
Recent breakthroughs in AI could make embodied AI possible, according to Kendall. Embodied AI aims to solve AI problems for virtual robots that interact with other robots in a virtual world, so that what they learn can then be transferred to real robots.
Kendall highlighted five advancements in autonomous driving that could be unlocked with the recent AI breakthroughs: generalization, performance, understanding & reasoning, human-machine interaction, and remote assistance.
AV developers have a long list of driving scenarios and edge cases that they need to prepare their AI for, and the capabilities of foundation models can help them to tackle those cases.
Generative AI models are able to use their foundation intelligence to reason about situations in a generalized way. This general-purpose reasoning could make it possible to prompt vehicles to drive in any scenario, including edge cases, without prior experience, Kendall said.
Wayve.AI is currently exploring ways of accelerating its roadmap by using other domain-agnostic data sources, like text data, for pre-training models. The company doesn’t see this text data as a replacement for on-road testing or other data used for safety validation but as a supplement to its training data corpus. That corpus includes a mix of on-road expert driving data, fleet data supplied by its fleet partners, and simulated and re-simulated off-road data.
“Discovering new ways to pre-train foundation models and learn robustness through other data sources can reduce our fleet data requirements and enable us to train models faster,” Kendall said.
Understanding & Reasoning
Wayve.AI is also doing research to understand how its AI models are making sense of the world and how they make decisions to drive through it safely. Generative AI allows the company to use natural language and generative techniques to interrogate and understand AI models.
“We’re pioneering ways for our AI to answer questions in natural language, render a video of what it expects to happen next, or even reason about counterfactual changes to the scene,” Kendall said.
By aligning robots and natural language, Wayve.AI is able to give instructions to the AV in a conversational manner, which opens up possibilities for the company to ‘backseat drive’ a car, personalize the driving experience, or provide more flexibility in service.
In the future, this could allow Wayve.AI to have its AV safety operators provide feedback in real time, and to use that feedback to help the model better align with human expectations, improving trust and safety.
Aligning language representations with Wayve.AI’s Driver will also allow the company to send text prompts to the model. These prompts could explain specific behaviors in much greater detail than space and time coordinates or driving commands can. This could help increase the speed and accuracy of remotely assisting AVs in difficult situations.