When humans look at a scene, they see objects and the relationships between them. On top of your desk, there might be a laptop that is sitting to the left of a phone, which is in front of a computer monitor.
Many deep learning models struggle to see the world this way because they don’t understand the entangled relationships between individual objects. Without knowledge of these relationships, a robot designed to help someone in a kitchen would have difficulty following a command like “pick up the spatula that is to the left of the stove and place it on top of the cutting board.”
In an effort to solve this problem, MIT researchers have developed a model that understands the underlying relationships between objects in a scene. Their model represents individual relationships one at a time, then combines these representations to describe the overall scene. This enables the model to generate more accurate images from text descriptions, even when the scene includes several objects that are arranged in different relationships with one another.
This work could be applied in situations where industrial robots must perform intricate, multistep manipulation tasks, like stacking items in a warehouse or assembling appliances. It also moves the field one step closer to enabling machines that can learn from and interact with their environments more like humans do.
“When I look at a table, I can’t say that there is an object at XYZ location. Our minds don’t work like that. In our minds, when we understand a scene, we really understand it based on the relationships between the objects. We think that by building a system that can understand the relationships between objects, we could use that system to more effectively manipulate and change our environments,” says Yilun Du, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.
Du wrote the paper with co-lead
This article is trimmed, please visit the source to read the full article.
The post Artificial intelligence that understands object relationships appeared first on MIT News | Massachusetts Institute of Technology.