Perception Arises with Locomotion

Hand-Eye Coordination and Computer Vision

For humans, it goes without saying that vision is extremely valuable. When you stop to think about it, it's remarkable what a diverse set of capabilities is enabled by human vision - from reading facial expressions, to navigating complex three-dimensional spaces (whether by foot, bicycle, car, or otherwise), to performing intricate tasks like threading a needle.

Image: Embedded Vision Alliance

One of the reasons why I'm so excited about the potential of computer vision is that I believe that it will bring a similar range of diverse and valuable capabilities to many types of devices and systems. In the past, computer vision required too much computation to be deployed widely. But today, sufficient processing power is available at cost and power consumption levels suitable for high-volume products. As a result, computer vision is proliferating into thousands of products. The vast range of diverse capabilities enabled by vision (from user interfaces to video summarization to navigation, for example), coupled with the wide range of potential applications, can be daunting. How do we figure out which of these capabilities and applications are really worthwhile, and which are mere novelties?

I think the analogy with biological vision can help. In a recent lecture, U.C. Berkeley professor Jitendra Malik pointed out that in biological evolution, "perception arises with locomotion." In other words, organisms that spend their lives in one spot have little use for vision. But when an organism can move, vision becomes very valuable - enabling the organism to seek food and mates, for example, and to avoid becoming food for other creatures. In the technological world, to paraphrase Professor Malik, when you put vision and locomotion together, you get things like self-driving cars. And vacuum cleaning robots, obstacle-avoiding drones, driverless forklifts, etc. It's possible to build autonomous, mobile devices like these without vision, but it rarely makes sense to do so. In other words, just as in the biological world, vision becomes essential when we create devices that move about.

What other clues can we glean from biology to inform our thinking about the most valuable uses of computer vision? In his lecture, Professor Malik pointed out that in biological evolution, "the development of the hand led to the development of the brain." While feet carry us from place to place, hands are arguably the main means by which humans act on the physical world. Human hands are extraordinarily versatile - and vision is essential to realizing their potential. Similarly, machines that act on the physical world require visual perception to realize their full potential. For years, this has been evident through research projects showing that vision-enabled robots can do amazing things, from the robot that always wins at Rock, Paper, Scissors to robots that learn how to grasp new objects through experimentation. What's exciting now is that robots that use vision to act on the physical world are being deployed at scale, from tiny interactive toys to large agricultural machines. Of course, not all of these robots have what we would think of as "hands"; depending on the tasks they're designed for, other types of manipulators may be appropriate. In his lecture, Professor Malik quoted the Greek philosopher Anaxagoras, who said: "It is because of being armed with hands that man is the most intelligent animal." Similarly, as machines gain the ability to interact with the physical world, they need intelligence - especially visual intelligence - to become truly capable.

If you want to understand how computer vision is changing industries and business models, and learn about the latest practical techniques and technologies for adding vision to all types of systems, I invite you to join me, Mark Bünger, and over 40 other speakers at the Embedded Vision Summit, taking place May 1-3, 2017 in Santa Clara, California. For details about this unique conference, and to register, please visit

Embedded Vision Alliance

This article appeared in inVISION 1 2017 (March 8, 2017).