The Robot Report

DiffuseDrive addresses data scarcity for robot and AI training

By Eugene Demaitre | August 1, 2025

DiffuseDrive builds photorealistic imagery such as this from real-world data sets. Source: DiffuseDrive

Robots and artificial intelligence need copious amounts of data to train on, and if that data is synthetic, it needs to be as realistic as possible. Capturing real-world data can be expensive and time-consuming, while simulation-based data has typically come from game engines and led to sim-to-real gaps. DiffuseDrive Inc. claimed that its generative AI platform evaluates existing data, identifies what is missing, and uses proprietary diffusion models to create photorealistic data.

Balint Pasztor, an engineer, and Roland Pinter, a physicist, founded DiffuseDrive in 2023 after meeting at Bosch. They then relocated the company from Hungary to San Francisco.

“We previously worked on Level 4 autonomous driving for Porsche,” Pasztor told The Robot Report. “Data scarcity is the missing piece to solving the puzzle of physical AI, which spans manufacturing, monitoring, agriculture, and aerospace.”

DiffuseDrive co-founders: CTO Roland Pinter (left) and CEO Balint Pasztor (right).

AI needs data specific to the domain

“Industry has been using the same models since the early 2010s, and automakers and robotics developers don’t have enough realistic data covering their operational design domains,” said Pasztor, who is now CEO of DiffuseDrive.

“Synthetic data from simulations wasn’t realistic enough for safety or mission-critical functions,” he added. “We needed AI-generated data that was indistinguishable from real life.”

Even at this year’s IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), people in the space scored only 50% when trying to tell the generated imagery from real imagery, he recalled. “They were just guessing,” Pasztor said.

Commercial robotics applications require large amounts of relevant data. Self-driving vehicles and item recognition for e-commerce picking have known and growing data sets, but automation can flexibly serve many more applications if it is properly trained.

DiffuseDrive identifies, understands gaps to fill

DiffuseDrive can bridge the simulation-to-reality gap by generating suggestions based on business logic, explained Pasztor. This allows it to create relevant data sets in days rather than months or years, he asserted.

“Engines like GPT or DALL-E can generate models, but you need a quality assurance [QA] layer like DiffuseDrive,” he said. “The QA layer is built on the application or use case from aerospace, etc., and the reasoning model understands what has already been presented.”

DiffuseDrive uses both classical and new methods of statistical analysis to contextually understand existing data and build out data points, similar to a point cloud, Pasztor said.

“We use a separate system to understand what clients already have, essentially building a decision tree,” he said. “For example, for Level 2 autonomous driving, we built a heat map of parking scenarios and object location distribution. DiffuseDrive then identified that it was missing large and close items at certain times. By getting to a wider distribution of data, we improved performance by 40%.”
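As a rough illustration of the heat-map idea Pasztor describes, the sketch below bins object locations into a grid and flags cells with too few examples. It is not DiffuseDrive's method: the function name `coverage_gaps`, the pixel bounding-box format, the grid size, and the `min_count` threshold are all assumptions made for this example.

```python
from collections import Counter


def coverage_gaps(boxes, image_w, image_h, grid=(2, 2), min_count=1):
    """Bin bounding-box centers into a grid and list under-represented cells.

    boxes: iterable of (x_min, y_min, x_max, y_max) in pixels
           (a hypothetical annotation format chosen for this sketch).
    Returns (heat_map, gaps), where heat_map[row][col] is an object count
    and gaps is a list of (row, col) cells with fewer than min_count objects.
    """
    cols, rows = grid
    counts = Counter()
    for x_min, y_min, x_max, y_max in boxes:
        cx = (x_min + x_max) / 2
        cy = (y_min + y_max) / 2
        # Clamp so a center on the right or bottom edge stays in the last cell.
        col = min(int(cx / image_w * cols), cols - 1)
        row = min(int(cy / image_h * rows), rows - 1)
        counts[(row, col)] += 1
    heat_map = [[counts[(r, c)] for c in range(cols)] for r in range(rows)]
    gaps = [(r, c) for r in range(rows) for c in range(cols)
            if heat_map[r][c] < min_count]
    return heat_map, gaps


# Made-up boxes in a 640x480 image: two objects top-left, one bottom-right.
example_boxes = [(10, 10, 50, 50), (20, 30, 60, 80), (500, 400, 620, 470)]
heat_map, gaps = coverage_gaps(example_boxes, image_w=640, image_h=480)
print(heat_map)  # [[2, 0], [0, 1]]
print(gaps)      # [(0, 1), (1, 0)]
```

In this toy case, the empty cells are the regions where a generator would be asked for more examples. A real system would presumably also bin by object size, distance, and time of day, the dimensions mentioned in the parking example.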

Customers control the ODD data

At the same time, DiffuseDrive does not develop domain expertise. Instead, the company digests its customers’ documentation and real-world operational design domain (ODD) data.

“They’re the domain experts and are in control in terms of generating their requirements,” said Pasztor. “They don’t want anyone to take over their jobs but want us to augment them.”

Once it has the basic data, DiffuseDrive uses semantic segmentation, contextual and visual labeling, as well as 2D and 3D bounding boxes. “Every time they generate images, the data-point map fills up, not just filling gaps but also expanding ODD knowledge,” Pasztor said.
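To make the label types concrete, here is a minimal sketch of how one labeled image could be represented. The class and field names (`Box2D`, `Box3D`, `LabeledObject`, `LabeledImage`, `context_tags`, `segmentation_mask_path`) are hypothetical and do not reflect DiffuseDrive's actual schema, which the company has not published here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Box2D:
    """Axis-aligned 2D bounding box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass
class Box3D:
    """3D bounding box: center and size in meters, heading (yaw) in radians."""
    center: tuple[float, float, float]
    size: tuple[float, float, float]
    yaw: float = 0.0


@dataclass
class LabeledObject:
    category: str                      # visual label, e.g. "car"
    box_2d: Box2D
    box_3d: Optional[Box3D] = None     # not every object needs a 3D box
    context_tags: list[str] = field(default_factory=list)  # contextual labels


@dataclass
class LabeledImage:
    image_id: str
    segmentation_mask_path: str        # per-pixel class mask stored separately
    objects: list[LabeledObject] = field(default_factory=list)

    def categories(self) -> set[str]:
        return {obj.category for obj in self.objects}


# Made-up record for illustration only.
car = LabeledObject("car", Box2D(100, 200, 300, 350), context_tags=["night"])
image = LabeledImage("frame_0001", "masks/frame_0001.png", [car])
print(image.categories(), car.box_2d.area())
```

Records like these are what a coverage analysis would count over, so each newly generated image adds to the same data-point map.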

Customers control their domain data, which is then rapidly analyzed for gaps. Source: DiffuseDrive.

DiffuseDrive sees market opportunities

The global market for AI in robotics could experience a compound annual growth rate of 38.5%, expanding from $12.77 billion in 2023 to $124.77 billion by 2030, according to Grand View Research.
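The growth figures can be checked against the standard compound annual growth rate formula. The short sketch below assumes seven compounding years (2023 to 2030); the function name `cagr` is chosen for this example only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of U.S. dollars, as cited from Grand View Research.
rate = cagr(12.77, 124.77, 2030 - 2023)
print(f"{rate:.1%}")  # roughly 38.5%, matching the reported rate
```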

“Our vision is to eventually have every autonomous system use DiffuseDrive data — it could be an enterprise or an individual’s project,” said Pasztor. “We decided to build on our experience with cars and drones, since autonomous vehicles still need a lot of data, and most companies don’t have the scale of Tesla.”

DiffuseDrive is onboarding its third wave of customers, following drone pilots and then autonomous driving and security monitoring. They include AISIN, Continental, and Denso. The company said it also sees potential in defense, warehousing, construction, and agriculture.

“At CVPR, we spoke with 50 potential customers from the Fortune 500, several of which are producing not only autonomous systems but also stationary ones like industrial robots,” Pasztor said. “Healthcare people were also interested in closing the data loop.”

In May, DiffuseDrive raised $3.5 million in seed funding, adding to $1 million it previously received from E2VC. It also appointed Jordan Kretchmer, a senior partner at Outlander VC and co-founder of Rapid Robotics Inc., to its board.

“Jordan has experience in robotics investment, and our thesis is to be industry-agnostic, from manufacturing applications like QA all the way to household picking robots,” Pasztor said. “Realistic imagery should spread quickly between different verticals, as we’re learning from everyone. The differentiator is not the synthetic data anymore; it’s creating the data engine.”

“As my co-founder says, ‘Software is developed iteratively, so why isn’t data?’” he concluded.


About The Author

Eugene Demaitre

Eugene Demaitre is editorial director of the robotics group at WTWH Media. He was senior editor of The Robot Report from 2019 to 2020 and editorial director of Robotics 24/7 from 2020 to 2023. Prior to working at WTWH Media, Demaitre was an editor at BNA (now part of Bloomberg), Computerworld, TechTarget, and Robotics Business Review.

Demaitre has participated in robotics webcasts, podcasts, and conferences worldwide. He has a master's from the George Washington University and lives in the Boston area.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media