One of the highlights of Hot Chips 2019 was the presentation of the Cerebras Wafer Scale Engine – an AI processor chip that was as big as a wafer, containing 1.2 trillion transistors and set at over 46225 square millimetres of silicon. This was enabled through breakthrough techniques in cross-reticle patterning, but with the level of redundancy built into the design, ensured a yield of 100%, every time. The first WSE system, the CS-1, was put out on display at Supercomputing 2019, where we got a chance to bite into the design with Andrew Feldman, the founder and CEO of Cerebras.
Unfortunately I never got around to writing up my discussions with Andrew, however what we did learn at the time is that the CS1 is a fully integrated 15U chassis that requires 20 kW of power to push to the chip through 12x 4 kW power supplies (some redundancy built-in). The chip is mounted vertically for the sake of ease of access, which is quite bizarre in the modern world of computing. Most of the chassis was custom built for the CS-1, including the tooling and a fair amount of commercial 3D printing. Andrew also said at the time that while there was no minimum order quantity for the CS-1, however each one would cost ‘a few million’.
Today’s announcement from the Pittsburgh Supercomputing Center (PSC) helps round that number down to perhaps ~$2 million. Through a $5 million grant from the National Science Foundation (NSF) to the PSC, a new AI supercomputer will be built, called Neocortex. At the heart of Neocortex will be hardware built in partnership with Cerebras and Hewlett Packard Enterprise.
Specifically, there will be two CS-1 machines at the heart of Neocortex. The CS-1 supports asynchronous models through TensorFlow and pyTorch, with the software platform able to optimize the size of the workloads for the available area on the CS-1 Wafer Scale Engine.
The pair of CS-1 machines will be coupled with an ‘extreme’ shared-memory HPE Superdome Flex server, which contains 32 Xeon CPUs, 24 TB of DDR4, 205 TB of storage, and 1.2 Tbps of network interfacing. Neocortex is expected to be used to enable AI researchers to train their models, covering areas such as healthcare, disease, power generation, transportation, as well as pressing issues of the day.
The machine will be installed in late 2020. PSC has stated that access to Neocortex will be available to researchers in the US at no cost.
When we spoke to Cerebras last year, the company stated that they already had orders in the ‘strong double digits’. When pressed, I managed to get that from ’12 to several dozen’. A number of machines were ordered for the Argonne National Laboratories at the time, and I suspect others are now investing.
Interestingly enough, at Hot Chips 2020 this year, the company is set to disclose its second generation Wafer Scale Engine. At a guess, I would suggest that this is slightly further away to commercialization than WSE1 was when it was announced, but the company seems to have had substantial interest in their technology.