As you may have seen, Google is working on a cool project called Tango, which allows smartphones and tablets (and possibly Google Glass down the road) to not only see, but also understand, your environment. This is made possible by computer vision techniques not unlike those used in Kinect 2 today. To make it work at a scale and power consumption compatible with a phone's battery capacity, Google has partnered with Movidius, a semiconductor company that builds the Myriad 1, a vision processor designed from the ground up for this task.
The end game: give “human-like” vision to your handset
The Movidius Myriad 1 chip is able to watch a scene in real time and perform or accelerate tasks like extracting corners, detecting and tracking feature points, matching feature points, building depth maps (from IR or stereo cameras), stitching images, and many more things that are critical to computer-vision or virtual-reality apps. And it does so with a 10X performance-per-watt advantage over a classic application processor.
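To give a feel for one of the tasks listed above, here is a toy version of corner extraction using the classic Harris response, R = det(M) - k·trace(M)², where M is the structure tensor (sums of squared image gradients over a small window). This is a plain-Python sketch for illustration only, not Movidius code; a vision processor evaluates this kind of per-pixel math in parallel on live camera frames.

```python
# Toy Harris corner scoring: high positive response at corners,
# negative response along edges, near zero in flat regions.

def harris_response(img, k=0.05):
    h, w = len(img), len(img[0])
    # Image gradients via central differences (zero on the border).
    ix = [[(img[y][x + 1] - img[y][x - 1]) / 2 if 0 < x < w - 1 else 0.0
           for x in range(w)] for y in range(h)]
    iy = [[(img[y + 1][x] - img[y - 1][x]) / 2 if 0 < y < h - 1 else 0.0
           for x in range(w)] for y in range(h)]
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor summed over a 3x3 window.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[y][x] = (sxx * syy - sxy * sxy) - k * (sxx + syy) ** 2
    return resp

# A bright 4x4 square in a 7x7 frame: its corner sits at (3, 3).
img = [[1.0 if y <= 3 and x <= 3 else 0.0 for x in range(7)] for y in range(7)]
r = harris_response(img)
best = max(((y, x) for y in range(7) for x in range(7)),
           key=lambda p: r[p[0]][p[1]])
print(best)          # → (3, 3), the corner pixel
print(r[1][3] < 0)   # → True: edge pixels score negative
```

The point of the exercise: even this trivial detector is a dense, per-pixel workload with small local windows, which is exactly the shape of computation a dedicated vision processor is built to run fast and cheap.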
It can do so in a power envelope many times smaller than what a device like Kinect 2 consumes today. If you look at Kinect 2, you can see how big it is, and you'll notice the fan in the back, which is always a tell-tale sign of high power consumption. Going forward, Project Tango and Movidius will provide a similar capability on smart devices. Eventually, the whole Android ecosystem and the computer industry can benefit from this research.
A new class of parallel chips
On the surface, one may think that graphics processors could perform this task, so why do we need a dedicated vision processor? The answer fits in one word: “branching”. Although GPUs could perform many of the tasks required for computer vision, many of the algorithms involved are full of conditions that trigger branching, something GPUs are fundamentally not very good at. So it comes down to a power-efficiency difference: for those tasks, a chip like Myriad 1 could in theory outperform classic CPUs and GPUs.
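The branching problem is easy to see in code. GPU lanes execute in lockstep, so a data-dependent `if` typically forces the hardware to run both sides of the branch and mask out the unwanted results. The sketch below (illustrative Python, not tied to any real GPU API) shows the same per-pixel classification written the natural, branchy way and in the branch-free, mask-based form that SIMD hardware prefers:

```python
def classify_branchy(pixels, lo, hi):
    # Natural per-pixel code: a three-way branch per element.
    # Vision code is full of such tests ("is this pixel a feature?").
    out = []
    for p in pixels:
        if p < lo:
            out.append(0)        # background
        elif p > hi:
            out.append(255)      # highlight
        else:
            out.append(p)        # keep as-is
    return out

def classify_predicated(pixels, lo, hi):
    # Branch-free rewrite: every lane does the same arithmetic and the
    # "branch" becomes 0/1 mask multipliers -- the GPU-friendly form,
    # which wastes work whenever most pixels take one side of the test.
    return [0 * (p < lo) + 255 * (p > hi) + p * (lo <= p <= hi)
            for p in pixels]

pixels = [3, 60, 120, 200, 250]
print(classify_branchy(pixels, 50, 220))   # → [0, 60, 120, 200, 255]
print(classify_branchy(pixels, 50, 220) ==
      classify_predicated(pixels, 50, 220))  # → True
```

A processor with good branch handling can run the first form directly and skip the untaken work; a wide SIMD machine effectively pays for the second form either way, which is the efficiency gap Movidius is targeting.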
To achieve this feat, Movidius has built a processor optimized for parallelism, with “tens of cores”, good branching abilities, and a low clock frequency. Although Movidius didn't share the exact frequency with me, I have been told that it is in the “hundreds of megahertz”, so that gives us some idea. Given that dynamic power consumption is largely determined by voltage and frequency, I can see how going to a lower frequency would help Movidius tremendously. Movidius has also optimized its design to keep data on-chip as much as possible to avoid using external bandwidth, for both speed and power-consumption reasons: the motto “bandwidth is power (consumption)” has been around forever in the mobile world.
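A quick back-of-the-envelope shows why the low-frequency, many-core trade-off pays off. Dynamic CMOS power scales roughly as P = C·V²·f, and lowering the clock usually allows a lower supply voltage too, so the savings compound; lost throughput can then be recovered with more parallel cores. The numbers below are purely illustrative, not Movidius specs:

```python
# Dynamic power of CMOS logic: P = C * V^2 * f
# (C = switched capacitance, V = supply voltage, f = clock frequency).
# Illustrative values only -- not actual Movidius figures.

def dynamic_power(c, v, f):
    return c * v * v * f

fast = dynamic_power(c=1.0, v=1.2, f=2.0e9)   # one core at 2 GHz, 1.2 V
slow = dynamic_power(c=1.0, v=0.8, f=0.5e9)   # one core at 500 MHz, 0.8 V

# Four slow cores match the fast core's aggregate clock throughput...
print(4 * 0.5e9 == 2.0e9)                 # → True
# ...at a fraction of the power, thanks to the V^2 term:
print(round(fast / (4 * slow), 2))        # → 2.25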
I talked to Remi Remi El-Ouazzane, the CEO of Movidius, and he was very candid about how impressive the GPGPU technology is. However, he says that at the end of the day, his vision processor is simply “extremely tuned” for the computer vision workload, and that’s why it is the best for the job. That makes sense, just like GPUs have outperformed CPUs because they are tuned for massively parallel computing (which is what Computer Graphics is), vision processors can equally be tuned for another type of workload and power budget.
And Android-based technology impulse
Right now, developers will be able to tap into the power of Myriad 1 via the Google Tango SDK which is the highest-profile project that Movidius is working on. Tango works on Android devices, but chances are that others will also jump on the computer-vision train. If OEMs want to build their own vision pipeline, Movidius supports C and OpenCL, but the company will also provide libraries that implement various computer-vision algorithms. Handset OEMs can access a turnkey reference solution that is ready to use with certain image sensors.
At the moment, Movidius is the only computer vision hardware partner in the Google Tango project.