#Chatbots

The AI revolution is coming to robots: how will it change them? – Nature

Humanoid robots developed by the US company Figure use OpenAI programming for language and vision. Credit: AP Photo/Jae C. Hong/Alamy
For a generation of scientists raised watching Star Wars, there’s a disappointing lack of C-3PO-like droids wandering around our cities and homes. Where are the humanoid robots fuelled with common sense that can help around the house and workplace?
Rapid advances in artificial intelligence (AI) might be set to fill that hole. “I wouldn’t be surprised if we are the last generation for which those sci-fi scenes are not a reality,” says Alexander Khazatsky, a machine-learning and robotics researcher at Stanford University in California.
From OpenAI to Google DeepMind, almost every big technology firm with AI expertise is now working on bringing the versatile learning algorithms that power chatbots, known as foundation models, to robotics. The idea is to imbue robots with common-sense knowledge, letting them tackle a wide range of tasks. Many researchers think that robots could become really good, really fast. “We believe we are at the point of a step change in robotics,” says Gerard Andrews, a marketing manager focused on robotics at technology company Nvidia in Santa Clara, California, which in March launched a general-purpose AI model designed for humanoid robots.
At the same time, robots could help to improve AI. Many researchers hope that bringing an embodied experience to AI training could take them closer to the dream of ‘artificial general intelligence’ — AI that has human-like cognitive abilities across any task. “The last step to true intelligence has to be physical intelligence,” says Akshara Rai, an AI researcher at Meta in Menlo Park, California.
But although many researchers are excited about the latest injection of AI into robotics, they also caution that some of the more impressive demonstrations are just that — demonstrations, often by companies that are eager to generate buzz. It can be a long road from demonstration to deployment, says Rodney Brooks, a roboticist at the Massachusetts Institute of Technology in Cambridge, whose company iRobot invented the Roomba autonomous vacuum cleaner.
There are plenty of hurdles on this road, including scraping together enough of the right data for robots to learn from, dealing with temperamental hardware and tackling concerns about safety. Foundation models for robotics “should be explored”, says Harold Soh, a specialist in human–robot interactions at the National University of Singapore. But he is sceptical, he says, that this strategy will lead to the revolution in robotics that some researchers predict.
The term robot covers a wide range of automated devices, from the robotic arms widely used in manufacturing, to self-driving cars and drones used in warfare and rescue missions. Most incorporate some sort of AI — to recognize objects, for example. But they are also programmed to carry out specific tasks, work in particular environments or rely on some level of human supervision, says Joyce Sidopoulos, co-founder of MassRobotics, an innovation hub for robotics companies in Boston, Massachusetts. Even Atlas — a robot made by Boston Dynamics, a robotics company in Waltham, Massachusetts, which famously showed off its parkour skills in 2018 — works by carefully mapping its environment and choosing the best actions to execute from a library of built-in templates.
For most AI researchers branching into robotics, the goal is to create something much more autonomous and adaptable across a wider range of circumstances. This might start with robot arms that can ‘pick and place’ any factory product, but evolve into humanoid robots that provide company and support for older people, for example. “There are so many applications,” says Sidopoulos.

The human form is complicated and not always optimized for specific physical tasks, but it has the huge benefit of being perfectly suited to the world that people have built. A human-shaped robot would be able to physically interact with the world in much the same way that a person does.
However, controlling any robot — let alone a human-shaped one — is incredibly hard. Apparently simple tasks, such as opening a door, are actually hugely complex, requiring a robot to understand how different door mechanisms work, how much force to apply to a handle and how to maintain balance while doing so. The real world is extremely varied and constantly changing.
The approach now gathering steam is to control a robot using the same type of AI foundation models that power image generators and chatbots such as ChatGPT. These models use brain-inspired neural networks to learn from huge swathes of generic data. They build associations between elements of their training data and, when asked for an output, tap these connections to generate appropriate words or images, often with uncannily good results.
Likewise, a robot foundation model is trained on text and images from the Internet, providing it with information about the nature of various objects and their contexts. It also learns from examples of robotic operations. It can be trained, for example, on videos of robot trial and error, or videos of robots that are being remotely operated by humans, alongside the instructions that pair with those actions. A trained robot foundation model can then observe a scenario and use its learnt associations to predict what action will lead to the best outcome.
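As a rough illustration of learning from paired demonstrations and instructions, here is a toy Python sketch. It is not how RT-2 or any real robot foundation model works: those use large neural networks trained on images and text. Here a lookup over a handful of invented records stands in for the learnt associations. The `DEMOS` records, the 2D observations and the action names are all hypothetical.

```python
import math

# Hypothetical demonstration records: (observation, instruction, action).
# Observations are toy 2D positions; real systems learn from camera images.
DEMOS = [
    ((0.0, 0.0), "pick up the can", "move_left"),
    ((1.0, 0.0), "pick up the can", "move_right"),
    ((0.0, 1.0), "open the drawer", "pull"),
    ((1.0, 1.0), "open the drawer", "push"),
]


def predict_action(observation, instruction, demos=DEMOS):
    """Return the action of the closest demonstration with the same instruction."""
    candidates = [d for d in demos if d[1] == instruction]
    if not candidates:
        raise ValueError(f"no demonstrations for instruction: {instruction!r}")
    # Nearest neighbour in observation space stands in for a learnt model.
    nearest = min(candidates, key=lambda d: math.dist(d[0], observation))
    return nearest[2]
```

Calling `predict_action((0.9, 0.1), "pick up the can")` picks the demonstration recorded closest to that observation. Unlike a foundation model, this toy fails on any instruction it has never seen, which is the gap that Internet-scale training is meant to close.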
Google DeepMind has built one of the most advanced robotic foundation models, known as Robotic Transformer 2 (RT-2), that can operate a mobile robot arm built by its sister company Everyday Robots in Mountain View, California. Like other robotic foundation models, it was trained on both the Internet and videos of robotic operation. Thanks to the online training, RT-2 can follow instructions even when those commands go beyond what the robot has seen another robot do before [1]. For example, it can move a drink can onto a picture of Taylor Swift when asked to do so, even though Swift’s image was not in any of the 130,000 demonstrations that RT-2 had been trained on.
In other words, knowledge gleaned from Internet trawling (such as what the singer Taylor Swift looks like) is being carried over into the robot’s actions. “A lot of Internet concepts just transfer,” says Keerthana Gopalakrishnan, an AI and robotics researcher at Google DeepMind in San Francisco, California. This radically reduces the amount of physical data that a robot needs to have absorbed to cope in different situations, she says.
But to fully understand the basics of movements and their consequences, robots still need to learn from lots of physical data. And therein lies a problem.
Although chatbots are being trained on billions of words from the Internet, there is no equivalently large data set for robotic activity. This lack of data has left robotics “in the dust”, says Khazatsky.
Pooling data is one way around this. Khazatsky and his colleagues have created DROID [2], an open-source data set that brings together around 350 hours of video data from one type of robot arm (the Franka Panda 7DoF robot arm, built by Franka Robotics in Munich, Germany), as it was being remotely operated by people in 18 laboratories around the world. The robot-eye-view camera has recorded visual data in hundreds of environments, including bathrooms, laundry rooms, bedrooms and kitchens. This diversity helps robots to perform well on tasks with previously unencountered elements, says Khazatsky.
When prompted to ‘pick up extinct animal’, Google’s RT-2 model selects the dinosaur figurine from a crowded table. Credit: Google DeepMind
Gopalakrishnan is part of a collaboration of more than a dozen academic labs that is also bringing together robotic data, in its case from a diversity of robot forms, from single arms to quadrupeds. The collaborators’ theory is that learning about the physical world in one robot body should help an AI to operate another, in the same way that learning in English can help a language model to generate Chinese, because the underlying concepts about the world that the words describe are the same. This seems to work. The collaboration’s resulting foundation model, called RT-X, which was released in October 2023 [3], performed better on real-world tasks than did models the researchers trained on one robot architecture.
Many researchers say that having this kind of diversity is essential. “We believe that a true robotics foundation model should not be tied to only one embodiment,” says Peter Chen, an AI researcher and co-founder of Covariant, an AI firm in Emeryville, California.
Covariant is also working hard on scaling up robot data. The company, which was set up in part by former OpenAI researchers, began collecting data in 2018 from 30 variations of robot arms in warehouses across the world, which all run using Covariant software. Covariant’s Robotics Foundation Model 1 (RFM-1) goes beyond collecting video data to encompass sensor readings, such as how much weight was lifted or force applied. This kind of data should help a robot to perform tasks such as manipulating a squishy object, says Gopalakrishnan — in theory, helping a robot to know, for example, how not to bruise a banana.
Covariant has built up a proprietary database that includes hundreds of billions of ‘tokens’ — units of real-world robotic information — which Chen says is roughly on a par with the scale of data that trained GPT-3, the 2020 version of OpenAI’s large language model. “We have way more real-world data than other people, because that’s what we have been focused on,” Chen says. RFM-1 is poised to roll out soon, says Chen, and should allow operators of robots running Covariant’s software to type or speak general instructions, such as “pick up apples from the bin”.
Another way to access large databases of movement is to focus on a humanoid robot form so that an AI can learn by watching videos of people — of which there are billions online. Nvidia’s Project GR00T foundation model, for example, is ingesting videos of people performing tasks, says Andrews. Although copying humans has huge potential for boosting robot skills, doing so well is hard, says Gopalakrishnan. For example, robot videos generally come with data about context and commands — the same isn’t true for human videos, she says.
A final and promising way to find limitless supplies of physical data, researchers say, is through simulation. Many roboticists are working on building 3D virtual-reality environments, the physics of which mimic the real world, and then wiring those up to a robotic brain for training. Simulators can churn out huge quantities of data and allow humans and robots to interact virtually, without risk, in rare or dangerous situations, all without wearing out the mechanics. “If you had to get a farm of robotic hands and exercise them until they achieve [a high] level of dexterity, you will blow the motors,” says Nvidia’s Andrews.
But making a good simulator is a difficult task. “Simulators have good physics, but not perfect physics, and making diverse simulated environments is almost as hard as just collecting diverse data,” says Khazatsky.
Meta and Nvidia are both betting big on simulation to scale up robot data, and have built sophisticated simulated worlds: Habitat from Meta and Isaac Sim from Nvidia. In them, robots gain the equivalent of years of experience in a few hours, and, in trials, they then successfully apply what they have learnt to situations they have never encountered in the real world. “Simulation is an extremely powerful but underrated tool in robotics, and I am excited to see it gaining momentum,” says Rai.
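To make the appeal of simulation concrete, here is a toy Python sketch of how a simulator can churn out training data without wearing out hardware. It has nothing to do with the internals of Habitat or Isaac Sim: the one-dimensional point mass, the friction model and every parameter value are assumptions chosen only for illustration.

```python
import random


def step(position, velocity, force, dt=0.1, mass=1.0, friction=0.5):
    """Advance a 1D point mass by one time step (semi-implicit Euler)."""
    acceleration = (force - friction * velocity) / mass
    velocity = velocity + acceleration * dt
    position = position + velocity * dt
    return position, velocity


def generate_episode(n_steps, rng):
    """Apply random forces and record (state, action, next_state) transitions."""
    position, velocity = 0.0, 0.0
    transitions = []
    for _ in range(n_steps):
        force = rng.uniform(-1.0, 1.0)  # hypothetical random exploration policy
        next_position, next_velocity = step(position, velocity, force)
        transitions.append(
            ((position, velocity), force, (next_position, next_velocity))
        )
        position, velocity = next_position, next_velocity
    return transitions


rng = random.Random(0)  # fixed seed so the data set is reproducible
dataset = [t for _ in range(100) for t in generate_episode(50, rng)]
```

This produces 5,000 transitions almost instantly. The hard part that Khazatsky describes is the fidelity of `step`: a real simulator must model contact, deformation and sensing well enough for the learnt behaviour to transfer to a physical robot.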
Nature 630, 22-24 (2024)
doi: https://doi.org/10.1038/d41586-024-01442-5
Correction 31 May 2024: An earlier version of this feature gave the wrong name for Nvidia’s simulated world.
1. Brohan, A. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2307.15818 (2023).
2. Khazatsky, A. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2403.12945 (2024).
3. Open X-Embodiment Collaboration et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2310.08864 (2023).
Nature (Nature)
ISSN 1476-4687 (online)
ISSN 0028-0836 (print)
© 2025 Springer Nature Limited
