At the GTC conference in San Jose, NVIDIA has officially introduced its next-generation architecture called Pascal (after scientist Blaise Pascal). Although it is “a couple of years away”, NVIDIA talked about a few fundamental architecture changes that will take Pascal to the next level when compared to its latest Maxwell GPU architecture, which was just launched on desktops and laptops.
If you remember, a very similar-sounding architecture named Volta was talked about last year. NVIDIA has confirmed to us that Pascal is NOT a replacement for Volta. Instead, Pascal has been slotted in between Maxwell and Volta, as NVIDIA saw an opportunity to introduce an intermediate architecture significant enough to justify its development.
The Pascal architecture addresses one of the fundamental issues that has challenged computing since the very beginning: bandwidth. Although current GPUs have hundreds of GBps of bandwidth, it is still not enough, and studies have shown that, given enough data, computing performance hits a ceiling because moving data around takes more time than processing it.
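This ceiling is often described with the roofline model: achievable performance is capped by either raw compute or memory bandwidth, whichever runs out first. Here is a minimal sketch in Python; the peak-compute and bandwidth figures are hypothetical round numbers for illustration, not official NVIDIA specs.

```python
# Hypothetical GPU figures for illustration (not official specs):
PEAK_FLOPS = 5e12   # 5 TFLOP/s of peak compute
BANDWIDTH = 300e9   # 300 GB/s of memory bandwidth

def attainable_flops(arithmetic_intensity):
    """Roofline model: performance is capped by whichever is lower,
    raw compute or the rate at which memory can feed the chip.
    arithmetic_intensity = floating-point ops performed per byte moved."""
    return min(PEAK_FLOPS, BANDWIDTH * arithmetic_intensity)

# A memory-hungry kernel (1 op per byte) is bandwidth-bound:
print(attainable_flops(1))    # 3e11 -> only 0.3 TFLOP/s of the 5 available
# A compute-dense kernel (100 ops per byte) reaches the compute ceiling:
print(attainable_flops(100))  # 5e12 -> the full 5 TFLOP/s
```

The point of the sketch: for data-heavy workloads, adding more compute cores does nothing until memory bandwidth rises, which is exactly the problem NVLink and 3D memory are meant to attack.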
NVLink, a new chip-to-chip communication system built on the PCI-E protocol, will allow a multi-GPU graphics system to have 5X to 12X the bandwidth of PCI-E, making it possible to scale multi-GPU setups to the next level. Arguably, this has some ramifications for gaming, but it is really more important for scientific applications in which thousands of GPUs are required.
3D Memory is the second new technology that NVIDIA is using in Pascal chips. Memory dies are stacked on top of each other to increase capacity by 2.5X and bandwidth by “many times”, according to NVIDIA. This is the kind of thing I have seen in Samsung SSD storage devices, but not yet in actual DRAM. I’m not sure how the heat dissipation works, but NVIDIA is confident that it will productize this within two years.
The end result, says NVIDIA, is a new GPU architecture that will vastly outperform the current generation, and will do so at a much better power-consumption ratio.
Unified Memory is something that NVIDIA mentioned in the PowerPoint slides but has not talked about in any detail. My hope is that it will make it possible for the CPU and GPU to access the same data without having to copy it back and forth, wasting even more bandwidth.
And of course, we can guess that compute density will increase, with an ever-larger number of CUDA cores present on the die. This was a small taste of what NVIDIA’s next-gen GPU is going to be, but it is enough to whet our appetite.