Machine learning guarantees robots’ performance in unknown territory

Summary: As engineers increasingly use machine learning to build adaptable robots, new work puts guarantees on robot performance and safety, particularly for robots that must operate in new, unknown environments and contend with unfamiliar obstacles and constraints.

Full Story:

Experiment: In a space filled with randomly placed cardboard cylinders, a drone takes a test flight. The algorithm controlling the drone was trained on a thousand different obstacle-laden courses, but it has never seen this space or these obstacles. Nine times out of ten, the drone dodges every cardboard cylinder in its path.

PAC-Bayes control for obstacle avoidance with Parrot SWING
Source: Princeton University Researchers

The experiment matters because it demonstrates that modern robots can come with guarantees of success and safety even when operating in new, unknown environments. As engineers increasingly use machine learning to build adaptable robots, new work from Princeton University puts guarantees on robot performance and safety, particularly for robots that must operate in unfamiliar environments and contend with unfamiliar obstacles and constraints.

Anirudha Majumdar, an assistant professor of mechanical and aerospace engineering at Princeton University, said, "Over the last decade or so, there's been a tremendous amount of excitement and progress around machine learning in the context of robotics, primarily because it allows you to handle rich sensory inputs, like those from a robot's camera, and map these complex inputs to actions."

However, robot control algorithms powered by machine learning run the risk of overfitting to their training data: the algorithms become far less effective when they encounter situations unlike those they were trained on. To address this, Majumdar's Intelligent Robot Motion Lab is expanding the suite of tools available for training robot control policies, and quantifying the likely success and safety of robots performing in novel environments.

The researchers drew on machine learning frameworks from other fields for their research on robot manipulation and locomotion, presenting their findings in three papers. Majumdar said the new methods are among the first to apply generalization theory to the more complex task of guaranteeing robots' performance in unfamiliar settings. While other approaches have provided such guarantees under more restrictive assumptions, the team's methods offer guarantees that apply more broadly to performance in novel environments.

The first paper provides a proof of principle for applying the machine learning frameworks. The Princeton team tested the approach in a number of simulations, and validated the technique by assessing how a small drone called a Parrot Swing avoided obstacles in a novel space: the drone's control policy came with a guaranteed success rate of 88.4%, and it avoided obstacles in 18 of 20 trials (90%).
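The 88.4% figure is the kind of number a PAC-Bayes bound produces: from performance on the training environments, it certifies a worst-case expected failure rate on unseen environments. As a rough illustration only, here is the classic PAC-Bayes-kl bound for a randomized policy, with made-up inputs; this is not necessarily the exact bound or the numbers used in the paper.

```python
import math

def bernoulli_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def pac_bayes_failure_bound(emp_fail, m, kl_posterior_prior, delta=0.01):
    """Largest p with kl(emp_fail || p) <= (KL(post||prior) + ln(2*sqrt(m)/delta)) / m.

    With probability at least 1 - delta over the draw of the m training
    environments, the true expected failure rate of the randomized policy
    is at most the returned value.
    """
    rhs = (kl_posterior_prior + math.log(2 * math.sqrt(m) / delta)) / m
    lo, hi = emp_fail, 1.0
    for _ in range(100):  # binary search for the KL-inverse
        mid = (lo + hi) / 2
        if bernoulli_kl(emp_fail, mid) > rhs:
            hi = mid
        else:
            lo = mid
    return hi

# Illustrative inputs: 5% empirical failure over 1,000 training environments,
# KL of 1.0 nat between the learned policy distribution and the prior.
bound = pac_bayes_failure_bound(emp_fail=0.05, m=1000,
                                kl_posterior_prior=1.0, delta=0.01)
print(f"guaranteed success rate >= {1 - bound:.1%}")
```

The bound tightens as the number of training environments grows and as the learned policy distribution stays closer to the prior, which is why training on a thousand obstacle courses can yield a meaningful certificate for a course the drone has never seen.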

Anoopkumar Sonar, a computer science concentrator, and Alec Farid, a graduate student in mechanical and aerospace engineering, co-authored the paper with Majumdar. The work was published in The International Journal of Robotics Research on October 3, 2020.

Farid said of the new work, "When applying machine learning techniques from other areas to robotics, there are a lot of special assumptions you need to satisfy, and one of them is saying how similar the environments you're expecting to see are to the environments your policy was trained on. In addition to showing that we can do this in the robotic setting, we also focused on trying to expand the types of environments that we could provide a guarantee for."

Majumdar added, "The kinds of guarantees we're able to give range from about 80% to 95% success rates on new environments, depending on the specific task, but if you're deploying [an unmanned aerial vehicle] in a real environment, then 95% probably isn't good enough. I see that as one of the biggest challenges, and one that we are actively working on."

Hongkai Dai, a senior research scientist at the Toyota Research Institute in Los Altos, California, said, "These guarantees are paramount to many safety-critical applications, such as self-driving cars and autonomous drones, where the training set cannot cover every possible scenario. The guarantee tells us how likely it is that a policy can still perform reasonably well on unseen cases, and hence establishes confidence in the policy, where the stake of failure is too high."

The other two papers were presented on November 18 at the virtual Conference on Robot Learning. In them, the researchers examined refinements that bring robot control policies, and their guarantees, closer to being safe enough to deploy in the real world. One paper used imitation learning, in which a human "expert" manually guided the robot to dodge obstacles and to move and pick up unfamiliar objects, providing the data needed for training.

Allen Ren, the paper's lead author and a graduate student in mechanical and aerospace engineering, used a 3D computer mouse to control a robotic arm, training it to grasp and pick up mugs of various sizes and shapes.

The arm grasped the rims of 25 different mugs with its two finger-like grippers and picked them up, achieving a 93% success rate on easier tasks and 80% on harder ones.

Ren said, “We have a camera on top of the table that sees the environment and takes a picture five times per second. Our policy training simulation takes this image and outputs what kind of action the robot should take, and then we have a controller that moves the arm to the desired locations based on the output of the model.”
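The pipeline Ren describes (an image in, an action out, a controller tracking the result) is a standard sense-decide-act loop. A minimal sketch follows; every function name and the dummy pose here are illustrative stand-ins, not the team's actual code.

```python
import time

CONTROL_HZ = 5  # Ren's overhead camera captures five images per second

def capture_image():
    # Stand-in for grabbing a frame from the overhead camera.
    return [[0] * 64 for _ in range(64)]  # dummy 64x64 frame

def policy(image):
    # Stand-in for the learned policy: maps a camera image to a
    # target pose for the gripper (here a fixed dummy target).
    return {"x": 0.4, "y": 0.0, "z": 0.2, "gripper": "open"}

def move_arm_toward(target):
    # Stand-in for the low-level controller that drives the arm
    # toward the pose the policy requested.
    pass

def control_loop(steps):
    """Run the sense-decide-act loop for a fixed number of steps."""
    targets = []
    for _ in range(steps):
        frame = capture_image()        # 1. sense: image of the scene
        target = policy(frame)         # 2. decide: policy outputs an action
        move_arm_toward(target)        # 3. act: controller tracks the target
        targets.append(target)
        time.sleep(1.0 / CONTROL_HZ)   # pace the loop at the camera rate
    return targets
```

Separating the learned policy from the low-level controller, as in this sketch, lets the policy reason about images while the controller handles the physics of actually moving the arm.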

The third paper demonstrated the development of vision-based planners that provide guarantees for flying or walking robots carrying out planned sequences of movements through diverse environments. Sushant Veer, a postdoctoral research associate in mechanical and aerospace engineering and the paper's lead author, said, "That required coming up with some new algorithmic tools for being able to tackle that dimensionality and still be able to give strong generalization guarantees."

Generalization Guarantees for Multi-Modal Imitation Learning
Source: Intelligent Robot Motion Lab

Majumdar and Veer evaluated the vision-based planners on a drone navigating around obstacles and on a four-legged robot traversing rough terrain with slopes as steep as 35 degrees. The four-legged robot achieved an 80% success rate in unknown environments, and the researchers are working to further improve the safety and success guarantees of their policies.

Story Source:

Materials provided by Princeton University, Engineering School.

Journal Reference:

Anirudha Majumdar, Alec Farid, Anoopkumar Sonar. PAC-Bayes control: learning policies that provably generalize to novel environments. The International Journal of Robotics Research, 2020. DOI: 10.1177/0278364920959444
