Even before social distancing forced students to stay home around the world, parents, teachers, and therapists struggled to give enough attention to children with special needs. Last week, Embodied Inc. launched Moxie, a social robot designed to help children with cognitive development. Moxie uses machine learning and the SocialX platform to perceive and interact.
Maja Matarić, interim vice president and vice dean for research at the University of Southern California’s Viterbi School of Engineering, co-founded Embodied in 2016. The Pasadena, Calif.-based company said it “has assembled a world-class team of experts in child development, engineering, technology, game design, and entertainment to create Moxie.” Embodied has worked with advisors from Disney, MIT, Pixar, and The Jim Henson Co., among others.
In addition, Embodied has worked with Yves Béhar’s fuseproject design firm, and it raised a total of $11.7 million in equity plus $4 million in venture debt in March 2020. The company should not be confused with Covariant (formerly Embodied Intelligence), and its product is different from Diligent Robotics Inc.’s Moxi hospital robot.
“As an early investor in Embodied, it’s exciting to see how the company is leveraging machine learning to create innovations that enhance our daily lives,” said Wendell Brooks, Intel senior vice president and president of Intel Capital. “Moxie is a true reflection of that.” Other investors include Toyota AI Ventures, Amazon Alexa Fund, Sony Innovation Fund, and Grishin Robotics.
Identifying where to improve human-machine interaction
“At Embodied, we have been rethinking and reinventing how human-machine interaction is done beyond simple verbal commands, to enable the next generation of computing, and to power a new class of machines capable of fluid social interaction,” stated Paolo Pirjanian, co-founder and CEO of Embodied and former chief technology officer of iRobot Corp. “Moxie is a new type of robot that has the ability to understand and express emotions with emotive speech, believable facial expressions and body language, tapping into human psychology and neurology to create deeper bonds.”
“When I was in my mid-teens, I was interested in going after the medical field, but then, by accident, I got a computer,” he told The Robot Report. “I slowly learned how to code, and later got a computer science degree and then a Ph.D. in robotics in Denmark. I worked at JPL [NASA’s Jet Propulsion Laboratory] on Mars rovers and was enticed to join a robotics startup whose core technology spun out in 2001 as VSLAM [Visual Simultaneous Localization and Mapping].”
“All of iRobot’s products use VSLAM, which is known to be the world’s most cost-effective solution for visual navigation,” Pirjanian said. “I spent a year researching into areas where human-robot interaction could help people, and one obvious target is the elderly population, which is socially isolated. Intuition Robotics has a good team working to serve that population.”
“At the other end of the spectrum are children who receive clinical therapy for AD/HD [Attention-Deficit/Hyperactivity Disorder], anxiety, or autism,” he added. “There’s a big need, the cost of therapy is super-high, and there are waiting lists. In talking with families, I also learned that it was not just children with developmental difficulties, but there was also a lot of interest in helping every child, even ‘neurotypical’ children.”
“In the past five-plus years, studies have found that emotional, or ‘EQ’ skills, are just as important as IQ skills,” explained Pirjanian. “There are lots of tools for STEM [science, technology, engineering, and mathematics] education, but not much for EQ skills.”
“A number of other studies have shown how excessive screen time is causing anxiety, and the Centers for Disease Control notes that suicide rates are at record highs,” he said. “We decided to develop a product to help children with social, cognitive, and emotional interactions.”
Developing Moxie to be different
“Embodied has been working on Moxie for four years in stealth,” said Pirjanian. “We have top technologists in computer vision and conversational AI. We had to push the frontier of the state of the art — a lot of social robots try to adopt what has already been successful for Amazon Alexa or Google Home, which is not really a social interaction. It’s more transactional. Kids who have Alexa can become rude because they’re used to barking commands.”
“Many robotics companies get started with a naive attitude about what it takes to be a successful company. They bring a technology to a market that isn’t looking for a technology but is looking for solutions,” he said. “Only a few robotics companies have been able to offer a clear value proposition, such as iRobot or Kiva [now Amazon Robotics].”
“We decided to design our hardware and software with a clear purpose — to help enhance human capabilities in a safe environment,” Pirjanian said. “In 2018, the FDA granted its first approval for therapeutic software that can be prescribed. We’re not saying that we’re going to provide a replacement for therapy, but during testing, parents have already been surprised at how the robot can help their children.”
“A third challenge for social robots, besides a transactional interface and a lack of purpose, is having enough interesting content,” said Pirjanian. “Every week, Moxie has a theme to work on certain aspects of social, cognitive, or emotional development. It takes a few weeks to get to know a child through a process of introducing the child to its interfaces and some global commands.”
“Craig Allen, our chief creative officer, worked at Disney and Jim Henson Interactive. We’re making sure content is tested, curated, and approved by child-development experts on staff applying evidence-based techniques,” he said. “We’ve also included the Encyclopaedia Britannica Merriam Webster Dictionary for Children.”
“The child can get out of a program and do something else, e.g., ‘What do you want to do?’ or [ask Moxie to] repeat itself if they didn’t hear,” he said. “This allows it to recover from bad interactions.”
“Since the beginning of computing, we have adapted ourselves to keyboards, but now, computers can adapt to us,” Pirjanian said. “Our SocialX platform for interaction could find its way into devices beyond robots. The robot needs to volley in conversation and know when you’re done. Long pauses can be awkward, and the robot has to know without a wake word when to jump in.”
“Another challenge is the ‘cocktail party problem’ — humans can filter out multiple conversations,” he said. “With multimodal fusion, we can use vision and the direction of audio to focus. In navigation, it’s the equivalent of the hard ‘robot kidnapping’ problem, where a blindfolded robot is moved. VSLAM solves that.”
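As a rough illustration of the multimodal fusion idea Pirjanian describes, the sketch below matches the direction a sound came from against the directions of faces seen by a camera and focuses on the closest one. It is not Embodied's implementation: the `Face` structure, the `pick_speaker` function, the bearing convention, and the 20-degree tolerance are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Face:
    """A face seen by the camera (hypothetical structure for this sketch)."""
    name: str
    bearing_deg: float  # horizontal angle of the face relative to straight ahead


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def pick_speaker(faces: list[Face], audio_bearing_deg: float,
                 tolerance_deg: float = 20.0) -> Optional[Face]:
    """Return the visible face closest to the sound direction, if close enough."""
    if not faces:
        return None
    best = min(faces, key=lambda f: angular_gap(f.bearing_deg, audio_bearing_deg))
    if angular_gap(best.bearing_deg, audio_bearing_deg) <= tolerance_deg:
        return best
    return None  # the sound does not line up with any visible face


# Two people in view; speech arrives from slightly left of center.
faces = [Face("child", -10.0), Face("parent", 45.0)]
focus = pick_speaker(faces, audio_bearing_deg=-4.0)
# Sound from behind the robot matches no visible face.
off_camera = pick_speaker(faces, audio_bearing_deg=170.0)
```

In this toy setup, speech arriving from a direction where no face is visible is ignored, which is one simple way to filter out a television or a conversation elsewhere in the room.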
“It’s not just natural language processing. A child won’t follow a strictly prescribed path — ‘How was your day?’” he said. “The robot has to be able to parse through intent and expression and provide an appropriate response.”
Moxie’s monthly content modules focus on simple themes, such as managing emotions and being kind to others. “Our team of 20 has a lot of expertise in empathy, mindfulness exercises, and natural interactions,” said Pirjanian. “Moxie can tell jokes and ask a child to read together, and children can learn about the Global Robotics Laboratory, a portal we’ve designed for them to track their accomplishments.”
Embodied builds intelligence into Moxie
“We were originally going to use off-the-shelf components to control costs, but we realized there weren’t any that did what we needed, so we had to grow our own,” Pirjanian said. “We made some risky choices to provide real solutions rather than a shortcut to a robot that doesn’t serve people’s needs.”
“Eyes are super-important for creating rapport,” he said. “Most social robotics companies stick a screen on their robots. We put a significant amount of work into a curved screen, which is important for embodiment.”
“Expressions and body language have to look right, so we put arms on Moxie, not for manipulation, but for gesturing,” said Pirjanian. “They’re removable and are better than a robot with just a tablet in its chest.”
Embodied has been testing Moxie with about 100 families in Los Angeles and Pittsburgh for the past 12 months. “We expected an average of 15 minutes of interaction per day, but we’ve seen a lot more than that,” Pirjanian said. “Moxie has already crossed the barrier of most social robots at close to one hour per day for engagements of 10 days. We have three to four months of content at launch.”
Unlike other social robots, Moxie relies less on the cloud and more on onboard computing. Embodied is carefully gathering data to help tune Moxie’s artificial intelligence.
“For privacy and security, we’re doing everything on board the robot,” said Pirjanian. “Everything but ASR, or automatic speech recognition, which uses Google, runs onboard. For security and responsiveness, 95% to 99% of Moxie’s programming is running onboard.”
“Moxie uses machine learning to recognize facial features, build models for conversations, and even adapt interactions to individual children,” he said. “When we collect data to train algorithms, it’s stripped of identifying information. We have some automated techniques for slicing and dicing data to understand what kinds of interactions improve for what type of child and how the content modules could apply to similar children.”
Moxie’s interactions with children in their homes could help both parents and researchers. “The child’s reactions can be shared with parents in a simple app dashboard. They can click and go deeper and choose to share this data with other caregivers, therapists, or teachers,” said Pirjanian. “Clinicians are thirsting for additional data, so having a robot deployed in more than 1,000 homes would give amazing insights.”
As of now, Embodied is developing Moxie’s skills in-house, but it might eventually release a software development kit for non-therapeutic applications, such as entertainment, Pirjanian said.
Availability and price point
Moxie is now available for pre-order in the U.S. for $50 down. Its pricing includes the device and a year’s subscription to content, including the monthly Moxie Mission Packs, the Global Robotics Laboratory, and behavioral analytics in the Embodied Moxie parent app.
“It will be like owning a smartphone at about $1,500 for the robot plus $60 per month in services, including continuous updates, monthly mailings of sticker books to unlock content, and online games,” Pirjanian said. “We’ve tried very hard to keep costs down, which is one reason we chose a tabletop model rather than a mobile one.”
“Jibo used an NVIDIA Tegra processor, which cost $120. That alone translated to $600 retail,” he said. “Our processor is a $10 off-the-shelf quad-core ARM, a fraction of the size.”
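The figures Pirjanian cites imply a markup of about five times from component cost to retail price. The short sketch below works through that arithmetic. Applying the same multiplier to Moxie's $10 processor is an assumption made only for illustration; the article does not say that Embodied prices this way, and the function name is hypothetical.

```python
def implied_retail_multiplier(component_cost: float, retail_impact: float) -> float:
    """Ratio between a part's cost and what it adds to the retail price."""
    return retail_impact / component_cost


# Figures quoted in the article: a $120 processor adding $600 at retail.
multiplier = implied_retail_multiplier(120.0, 600.0)

# Assumption: the same multiplier applies to a $10 processor.
arm_retail_impact = 10.0 * multiplier

# Difference in retail price attributable to the processor choice alone.
retail_savings = 600.0 - arm_retail_impact
```

Under that assumption, the cheaper processor would account for roughly $50 of the retail price rather than $600.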
Because of an “overwhelming” number of applications, Embodied today closed its Moxie Pioneer Mentor Program, which enables some families to beta-test an early version of the social robot for free.
The COVID-19 pandemic caused some delays in Embodied’s supply chain, but the company still expects to start shipping Moxie in September.