Each machine will have 1.3 petabytes of system memory and Cray’s ClusterStor systems will come with 26 petabytes of storage per site.
As written, that makes it sound like each 2-processor node will have 1.3 PB. If the figure is instead for the whole site, it works out to about 512 GB per node, which must be what they mean.
Incidentally, 26 PB works out to only about 10 TB per node, which is fairly unremarkable. The flash storage comes to only about 256 GB per node.
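The per-node figures are easy to sanity-check. A quick back-of-envelope sketch in Python; note the node count is my inference from 1.3 PB at 512 GB per node, not a number from the article:

```python
# Back-of-envelope: infer the node count from the memory figures,
# then derive per-node storage. The node count is an inference,
# not a number quoted in the article.
total_mem_bytes = 1.3e15       # 1.3 PB of system memory per machine
mem_per_node_bytes = 512e9     # assumed 512 GB per 2-processor node
total_storage_bytes = 26e15    # 26 PB of ClusterStor per site

nodes = round(total_mem_bytes / mem_per_node_bytes)       # ≈ 2539
storage_per_node_tb = total_storage_bytes / nodes / 1e12  # ≈ 10.2 TB
print(nodes, round(storage_per_node_tb, 1))  # → 2539 10.2
```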
What's strange is that there's no mention of GPUs. The numbers do check out, though: 12 PFLOPS works out to about 5 TFLOPS per node, which is in the ballpark of 2x 64-core EPYCs and too low to account for even a single GPU per node. A single Tesla V100 is rated at 7 TFLOPS of fp64; or, sticking with AMD, an MI60 would net you 7.4 TFLOPS of fp64.
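The FLOPS arithmetic in the same back-of-envelope style; the node count is inferred from the memory figures, and the EPYC clock and FLOPS-per-cycle numbers are my assumptions (Rome-class parts with AVX2 FMA at roughly 2.25 GHz all-core), not from the article:

```python
# Per-node fp64 throughput implied by 12 PFLOPS spread across
# the ~2,539 nodes inferred from the memory figures.
nodes = 2539
tflops_per_node = 12e15 / nodes / 1e12          # ≈ 4.7 TFLOPS

# Rough peak for 2x 64-core EPYC (Rome-class assumption):
# 16 fp64 FLOPS/cycle/core via 2x AVX2 FMA units (4 lanes x 2 ops),
# at an assumed ~2.25 GHz all-core clock.
epyc_pair_tflops = 2 * 64 * 16 * 2.25e9 / 1e12  # ≈ 4.6 TFLOPS

# Both sit well below a single V100 (7 TFLOPS fp64) or MI60 (7.4),
# which is why the per-node budget leaves no room for a GPU.
print(round(tflops_per_node, 1), epyc_pair_tflops)
```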
So, either their algorithms don't map well to GPUs, or perhaps they involve a lot of legacy code that would be too painful to port. I'd expect they'd want to run deep learning models, though, and for a mix of deep learning and conventional models, GPUs would offer the best balance of versatility and performance.