It’s not every day that a large company like Qualcomm steps into a new market, and we may look back on this day as an inflection point in the cloud AI arms race.

At its AI Day conference in San Francisco, Qualcomm announced not only a new chip architecture but a whole new data-center platform for AI computing, one that has the attention and support of significant data-center players like Facebook and Microsoft.

There are different AI workloads in datacenters, mainly divided into training (learning new skills) and inference (applying acquired skills). Qualcomm’s new Cloud AI 100 is an inference accelerator explicitly designed to deliver roughly 10X higher power efficiency than solutions deployed today. While GPUs remain better suited to training, which can require higher precision (16- or 32-bit), inference can run at 8-bit precision or less.
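Qualcomm hasn’t published the Cloud AI 100’s numerics, but the precision trade-off itself is easy to illustrate. Here is a minimal Python sketch of symmetric linear int8 quantization, one common scheme for inference (the functions and the scheme are illustrative assumptions, not Qualcomm’s implementation):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric linear quantization of float32 weights to int8.

    Maps [-max|w|, +max|w|] onto [-127, 127] and returns the
    int8 tensor plus the scale needed to dequantize it later.
    """
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 values."""
    return q.astype(np.float32) * scale

# Toy layer weights: int8 storage is 4x smaller than float32,
# and integer multiply-accumulates are far cheaper in silicon.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
error = np.abs(dequantize(q, scale) - w).mean()
print(f"mean absolute quantization error: {error:.6f}")
```

The small reconstruction error is typically acceptable for inference, which is why dedicated accelerators lean so heavily on low-precision integer math.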

Power efficiency is defined by how many computations one can perform per watt of power, measured in performance per watt. How is a 10X improvement even possible? Consider this: at first, AI convolutional neural networks (CNNs) ran on general-purpose CPUs with 2 to 64 large cores each.

Keith Kressin, SVP, Product Management, Qualcomm

Cristiano Amon, President, Qualcomm

Then GPUs (graphics processors) came along, and they were about 10X more power-efficient than CPUs at this kind of task. That’s because GPUs natively have a massively parallel architecture with thousands of tiny compute cores, where CPUs have 2 to 64. However, GPU architectures are designed for graphics, not AI. As such, there are inherent inefficiencies that can be improved upon.
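To make the performance-per-watt metric concrete, here is a toy calculation in Python (all figures are invented for illustration, not measured specs for any real chip):

```python
# Performance per watt = throughput / power draw.
# Hypothetical numbers chosen only to illustrate a 10x CPU-to-GPU gap.
cpu_tops, cpu_watts = 2.0, 200.0   # imaginary general-purpose CPU server
gpu_tops, gpu_watts = 30.0, 300.0  # imaginary GPU accelerator

cpu_eff = cpu_tops / cpu_watts     # TOPS per watt
gpu_eff = gpu_tops / gpu_watts

print(f"CPU: {cpu_eff:.2f} TOPS/W")
print(f"GPU: {gpu_eff:.2f} TOPS/W")
print(f"Improvement: {gpu_eff / cpu_eff:.0f}x")
```

A purpose-built inference chip aims to repeat that kind of jump over GPUs by shedding the graphics-specific circuitry that inference never uses.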

Qualcomm’s 7-nm Cloud AI 100 processor architecture brings all the benefits of decades of low-power smartphone design into a chip built from the ground up for AI. We don’t have the exact specs yet, but from a high-level view, it seems plausible that such an architecture could make a huge difference in power efficiency.

If you look at the proven performance-per-watt differences between Snapdragon and PC platforms, you can easily measure a multiple-fold gap in power efficiency in day-to-day applications. Magnify that with intense computing workloads, and the difference only grows.

However, hardware isn’t Qualcomm’s only strong point: the Cloud AI 100 chip is part of a broader AI-centric effort that includes software research, libraries, and drivers. To extract even more performance, Qualcomm is creating a set of tools, including compilers and analytics, that can reduce bandwidth requirements by compressing CNNs, or compute faster by using highly accurate approximations (giving up 1% of precision to reach a much higher performance level).
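Qualcomm didn’t detail how its toolchain compresses networks, but magnitude pruning is one well-known technique in this spirit: zeroing out the least important weights shrinks the model and cuts memory bandwidth at a small cost in accuracy. A minimal sketch follows (the helper and the sparsity level are illustrative assumptions, not Qualcomm’s method):

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights of a layer.

    Sparse tensors compress well and move less data between
    memory and compute units, reducing bandwidth pressure.
    """
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Prune 70% of a toy layer's weights, keeping the largest 30%.
w = np.random.randn(512, 512).astype(np.float32)
pruned, mask = magnitude_prune(w, sparsity=0.7)
print(f"weights kept: {mask.mean():.0%}")
```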

Qualcomm presented very clever ways to speed up AI computations, all revolving around the fact that AI offers many optimization opportunities that can be uncovered by analyzing each type of neural network.

Joe Spisak, Product Manager, Facebook AI

All in all, this is an exciting announcement for a cloud industry that is desperate to put a lid on the exponential rise of power consumption from AI workloads. Datacenters are bound by both space and power limitations, so more efficient architectures could provide much-needed relief on two of the most expensive datacenter metrics.

Cloud AI 100 is not a done deal yet: Qualcomm will start sampling the new product later this year, targeting production in 2020. It is promising enough that both Facebook and Microsoft were on stage to show their support, as they stand to benefit immediately once the technology is deployed.

For Qualcomm, this new market represents a very sizable opportunity to leverage existing technologies, even though it will also need to invent new tech to make this work. To protect its potential gains, Qualcomm can build a barrier to entry by supporting a swath of standards from the get-go, which is hard for a startup to match. Finally, its manufacturing scale (including economies of scale) will help keep competitors at bay.
