As we head towards Mobile World Congress (MWC) 2016, the onslaught of smartphone designs using Qualcomm’s new Snapdragon 820 SoC (processor) is about to hit. Now is a good time to refresh your memory about what this chip is, and how it compares to its predecessors. CPU, GPU and memory bandwidth are the main performance factors to observe.
Performance relative to power consumption (perf./Watt) matters as well, since battery life remains the #1 priority for most users. For communications, LTE or WiFi upgrades are the main drivers of improvements, so we will look into that as well.
Finally, there are items such as fast battery charging or security that can also be enabled by a new product. Here is some information from Qualcomm that shows what’s new, and how much better it thinks the performance is.
Built from the ground up
Don’t be fooled by Qualcomm’s numbering scheme: Snapdragon 820 is a chip built from the ground up, and nearly everything in it has changed. The CPU cores, the GPU, the DSP, the modem and the image processors have all been upgraded from Snapdragon 810.
On this new architecture, Qualcomm claims to have made improvements across the board, as seen in the presentation slide below:
Built with the 14nm FinFET semiconductor process
The semiconductor process describes the size and electrical properties of the chip’s building blocks. The smaller they are, the higher the potential “compute density”. Typically, new processes are used in a way that improves the power consumption for a given workload, resulting in better performance-per-watt than previous generations. 14nm (nanometer) is one of the smallest processes currently used in chip production.
FinFET is a transistor design that was created to function on particularly small transistors (~20nm and below) in which electric current leakage becomes a big problem. FinFET allows chip designers to find the optimum tradeoff between power usage/leakage and computing performance.
The smaller FinFET process alone should be responsible for a significant share of the overall performance and power usage improvement (~10%?). Qualcomm is using Samsung’s 14nm FinFET (gen2) process that was initially used for the Exynos processor of the Galaxy S6-series.
Snapdragon 820 is a system-on-chip (aka SoC) which is composed of many compute sub-units such as the Kryo CPU cores, the Hexagon 680 DSP, the Spectra image processor, the Adreno 530 graphics processor and the X12 4G LTE modem — just to mention the big ones.
General purpose computing: Kryo CPU
The performance “bread and butter” of any SoC comes from its central processing units (CPUs) and memory sub-system. This is the equivalent of the CPU (Intel/AMD) in a PC.
Snapdragon 820 is a quad-core processor that comes with four identical Qualcomm Kryo custom 64-bit cores. Two of them run at a maximum of 2.2GHz (best performance), while the other two run at a maximum of 1.6GHz (best power). As you may have guessed, the handset can utilize them at different times, depending on the workload at hand. The idea is similar to ARM’s big.LITTLE, except that the configuration is different.
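To make the two-cluster idea concrete, here is a minimal Python sketch of how a scheduler might choose where to run a task. Only the 2.2GHz and 1.6GHz maximum clocks come from the description above. The `Cluster` type, the `pick_cluster` function and the 0.6 load threshold are hypothetical, and none of this describes how Qualcomm or Android actually schedule work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    cores: int
    max_ghz: float


# Maximum clocks are from the article; the cluster names are illustrative.
PERFORMANCE = Cluster("performance", cores=2, max_ghz=2.2)
EFFICIENCY = Cluster("efficiency", cores=2, max_ghz=1.6)


def pick_cluster(load: float, threshold: float = 0.6) -> Cluster:
    """Return the cluster a task should run on.

    `load` is the task's estimated demand, from 0.0 (idle) to 1.0 (maximum).
    `threshold` is an assumed tuning value, not a Qualcomm figure.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    # Demanding work goes to the fast pair, light work to the frugal pair.
    return PERFORMANCE if load >= threshold else EFFICIENCY


if __name__ == "__main__":
    for task, load in [("background sync", 0.1), ("web page render", 0.8)]:
        print(f"{task}: {pick_cluster(load).name} cluster")
```

A real scheduler also weighs temperature, battery level and how long a task has been running, but the basic trade-off is the same: light work stays on the slower, more frugal cores.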
Kryo is compatible with the ARMv8-A instruction set, which is also used by ARM’s own Cortex-A35/A53/A57/A72 cores. From a software standpoint, Kryo behaves just like those ARM cores, but internally, Qualcomm has built the Kryo core from the ground up to use its unique optimizations. This allows Qualcomm to pick its own transistor budget and tweak the Kryo core in an optimal way.
Qualcomm will compete with chip designs from Samsung (Exynos) and Huawei/HiSilicon (Kirin) that may feature more cores (is it better? Not always), so I expect the competition to perform better in some synthetic tests that run across all cores at once, like Geekbench and other multi-threaded benchmarks.
Multithreaded (MT) performance is not that important in my opinion. For example, Apple’s dual-core processors typically lose in MT benchmarks, but still perform very well in the real world, because synthetic tests on a massive array of 6 to 8 cores are a very poor proxy for real app CPU usage. Kryo’s single-core performance is impressive, but Apple remains ahead of the game.
The memory sub-system has been greatly improved, and data can flow much faster than on Snapdragon 810. Independent synthetic tests show data streams that are between 30% and 100% faster from one generation to the next, depending on the task.
Hexagon 680 DSP (co-processor)
The Hexagon 680 is a DSP (digital signal processor), which is a specialized unit. Such computational units were originally designed to work on information streams such as audio or visual signals. Nowadays, they are stream processors that have massive amounts of computational power, thanks to the use of parallel units operating on wide data types. They work best when they operate on enormous quantities of relatively homogeneous data that must be processed with the same instructions or program.
Besides the raw speed, the Hexagon DSP is also valuable for consuming less power. If used properly, the DSP can efficiently offload work from the Kryo cores, resulting in faster computing and better battery life. It is extremely difficult to compare the real-world added value of a DSP because there is no standard way to measure and compare it across several SoCs.
Note: because DSPs and ISPs aren’t standardized across handsets, benchmarks typically don’t take them into account directly, so their computational power does not show in most benchmarks. That’s very unfortunate.
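The kind of workload that suits a DSP can be illustrated with a small filter: the same multiply-and-add is applied to every window of a signal. The Python sketch below shows only the pattern. It is not Hexagon code, the `fir_filter` name and the sample data are made up for illustration, and a real DSP would process many of these windows in parallel in hardware rather than one at a time.

```python
from typing import List, Sequence


def fir_filter(samples: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Apply the same multiply-accumulate to every window of the stream."""
    n = len(taps)
    if n == 0:
        raise ValueError("taps must not be empty")
    out = []
    # Every output sample is computed by the exact same short program,
    # which is what makes this work easy to spread over parallel units.
    for i in range(len(samples) - n + 1):
        window = samples[i:i + n]
        out.append(sum(s * t for s, t in zip(window, taps)))
    return out


if __name__ == "__main__":
    # A made-up, jittery "audio" signal smoothed by a two-tap average.
    audio = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
    print(fir_filter(audio, taps=[0.5, 0.5]))
```

Because every output depends only on a small, fixed window of input and uses identical instructions, the work splits cleanly across wide parallel units, which is the situation described above.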
Graphics processor: Adreno 530
The graphics processor (GPU) is the single largest generator of performance data points in gaming benchmarks. That’s because 3D and 2D graphics rendering operations cannot be performed as efficiently by the CPU or the DSP.
The Adreno 530 graphics unit is designed to be 40% faster in absolute terms, and 40% more power efficient for a given workload. We’ll look at real-world numbers a bit later, but it’s good to have a frame of reference. Additionally, Adreno 530 can access the same memory as the Kryo cores so they both can manipulate and share the same data without wasteful transfers.
An important addition to Adreno 530 is support for OpenCL 2.0. This enables the GPU to perform non-graphics computations such as physics or video compression. Here’s an example of massive physics computing with OpenCL, on a PC with an AMD GPU:
Game benchmarks show that Snapdragon 820 is very fast. The improvements can vary greatly from one benchmark to another, with 3DMark Ice Storm Unlimited scores being moderately higher than the Galaxy S6, while GFXBench shows a real 50% improvement over that same foe. This discrepancy can have several causes, but overall, you can expect a significant speed gain in real games.
Spectra ISP (Image Signal Processor)
The Spectra ISP unit is not part of the GPU, but I’ll talk about it here while we are on the subject of “graphics”. The stream of data coming from the cameras is typically processed by the ISP, and the data can then be routed to other units later, such as the GPU and CPU, if specific processing needs to be done.
Qualcomm says that Spectra can process up to 3 data streams coming from 25 Megapixel cameras. That is a vast quantity of data, and having three cameras can open new options for phone makers, such as having stereoscopic input, or multi-angle cameras such as the LG V10.
Next-gen 4G LTE with the Qualcomm X12 LTE Modem
Snapdragon 820 has an integrated Qualcomm X12 4G LTE modem. It is an LTE CAT12 (Category 12) modem that can (in theory) reach peak transfer rates of 600 Mbps (download) and 150 Mbps (upload). Real-world conditions will vary depending on one’s position relative to the cell tower and other factors, but that’s what the hardware is capable of doing under the best conditions.
This speed is possible because X12 connects to the wireless infrastructure using three 20MHz bands simultaneously (carrier aggregation), a feat that Snapdragon 810 could not do.
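To put those peak figures in perspective, here is a quick back-of-the-envelope calculation in Python. The 600 Mbps, 150 Mbps and three-carrier figures are the ones quoted above. The even split across carriers and the 750 MB file size are assumptions made for illustration, and the helper names are hypothetical.

```python
PEAK_DOWNLOAD_MBPS = 600  # Category 12 peak download, from the article
PEAK_UPLOAD_MBPS = 150    # Category 12 peak upload, from the article
AGGREGATED_CARRIERS = 3   # three 20MHz carriers, from the article


def per_carrier_mbps(peak_mbps: float, carriers: int) -> float:
    """Average share of the peak rate handled by each aggregated carrier.

    Assumes the carriers contribute equally, which is a simplification.
    """
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return peak_mbps / carriers


def transfer_seconds(size_megabytes: float, rate_mbps: float) -> float:
    """Time to move a file at a given rate (1 byte = 8 bits)."""
    if rate_mbps <= 0:
        raise ValueError("rate_mbps must be positive")
    return size_megabytes * 8 / rate_mbps


if __name__ == "__main__":
    share = per_carrier_mbps(PEAK_DOWNLOAD_MBPS, AGGREGATED_CARRIERS)
    print(f"Per-carrier share of peak download: {share:.0f} Mbps")
    # 750 MB is an arbitrary example size, not a figure from Qualcomm.
    print(f"750 MB down: {transfer_seconds(750, PEAK_DOWNLOAD_MBPS):.0f} s")
    print(f"750 MB up:   {transfer_seconds(750, PEAK_UPLOAD_MBPS):.0f} s")
```

These are best-case numbers. As noted above, the rates you actually see depend on your distance from the cell tower and on other network conditions.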
X12 also brings LTE-U (LTE in Unlicensed spectrum) which would allow wireless carriers to extend their coverage by using the 5GHz frequency that is commonly used by WiFi. “Unlicensed” means that carriers did not pay for an exclusive “license” to operate in a given spectrum (range of frequencies).
LTE-U isn’t like WiFi calling because, despite operating on the same 5GHz frequency, LTE-U uses the LTE protocol rather than the WiFi protocol. LTE-U was created and proposed by Qualcomm to deal with an exponentially growing use of wireless data. T-Mobile and Verizon are expected to deploy it in the U.S. LTE-U is a very smart way to increase network throughput.
The Snapdragon 820 processor is an excellent chip. With it, Qualcomm comes back to building custom CPU cores, a method that served it very well over the past decade. At the same time, the company keeps pushing on graphics and modem technology to maintain a sharp competitive edge. The ensemble forms a solid technological foundation for handset partners to build on.
I have considered Snapdragon 810 to be an intermediate processor which Qualcomm knew would not have a lasting architecture. Maybe it was the result of a jump to 64-bit that was forced onto Qualcomm by the market. With the arrival of the Kryo core, Qualcomm is back to a normal pace and the company can put all of its energy back into its line of products with more focus.
In the meantime, the SoC market has become more competitive than ever, and we are looking forward to comparing retail handsets powered by Snapdragon, Exynos, Kirin and other processors to see which performs best in the real world. Stay tuned!