Researchers have developed software that helps computers understand human body movements and poses.
We have already seen artificial intelligence software that can make critical decisions and direct drones to prevent poaching. At Carnegie Mellon University (CMU), robotics researchers have now developed a method that enables a computer to understand the body poses and movements of multiple people at the same time, including the positions of individual fingers. Yaser Sheikh, associate professor of robotics, has pointed out that this technique allows computers to track the nuances of human non-verbal communication and could open up new ways for people and machines to interact with each other.
Tracking multiple people in real time, particularly in social situations where one or more people may be in physical contact, presented a number of challenges. For larger body parts, such as arms, legs and faces, Sheikh and his team first identified all instances of each part in a scene, then associated those parts with particular individuals. Fingers, however, represented an even greater challenge. Since people use their hands to hold objects and make gestures, cameras have a hard time seeing all the parts of a hand at once. To get around this, the researchers used CMU’s multi-camera Panoptic Studio, a two-story dome fitted with 500 video cameras. The studio can provide 500 views of a person’s hand from a single shot and automatically annotate the hand positions.
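The two-step strategy described above — detect every instance of each body part in the scene, then group parts into individuals — can be illustrated with a toy sketch. Everything here is synthetic and simplified: the part names, the hard-coded detections, and the greedy nearest-neighbour pairing stand in for the learned detectors and association scores a real multi-person pose system would use; this is not CMU's actual method.

```python
import math

# Step 1 (simulated): detections a part detector might produce.
# Each entry is (part name, x, y). Two people are present.
detections = [
    ("neck", 100, 50), ("neck", 300, 55),
    ("left_shoulder", 80, 80), ("left_shoulder", 280, 85),
    ("right_shoulder", 120, 80), ("right_shoulder", 320, 85),
]

# Skeleton edges to connect (parent part -> child part).
edges = [("neck", "left_shoulder"), ("neck", "right_shoulder")]

def distance(a, b):
    """Euclidean distance between two (name, x, y) detections."""
    return math.hypot(a[1] - b[1], a[2] - b[2])

def group_parts(detections, edges):
    """Step 2 (simplified): greedily pair each person's parent part
    with the nearest unclaimed child part. Real systems use learned
    association scores instead of raw distance."""
    people = [{"neck": d} for d in detections if d[0] == "neck"]
    for parent, child in edges:
        candidates = [d for d in detections if d[0] == child]
        for person in people:
            if parent not in person or not candidates:
                continue
            best = min(candidates, key=lambda c: distance(person[parent], c))
            person[child] = best
            candidates.remove(best)
    return people

people = group_parts(detections, edges)
for i, person in enumerate(people):
    print(f"person {i}: {sorted(person)}")
```

Run on the synthetic scene above, the sketch assembles two skeletons of three parts each, matching each neck with the shoulders closest to it.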
Using these images, the team were able to build a dataset and software that can detect the poses of a group of people using just a single camera and a laptop computer. To encourage more research in this area, the team have made their data and computer code available to other research and commercial groups. Sheikh says that possible applications include allowing people to communicate with computers by pointing; enabling self-driving software to ‘learn’ when a pedestrian is about to step into the street by monitoring body language; and enabling new approaches to behavioural diagnosis and rehabilitation for conditions such as autism, dyslexia and depression. What other uses could there be for software that can read and understand human movements and poses?