I'd already heard some of these specs on a news program's coverage of The Sphere. I turned to it in the middle of the segment, so I didn't know what they were talking about, but the numbers were so mind-blowing that they immediately caught my attention.
This is one of the first times I've heard of something in Vegas I think I actually want to see.
They each rely on a collection of 27 nodes, each streaming at 4K through Hitachi Vantara's software, with a whopping 4 Petabytes (1 Petabyte = 1000 TB) of flash memory capable of 400 GB/s speeds.
IMO, it's more interesting to consider the specs of the individual nodes. So, 4 PB / 27 nodes = about 148 TB per node. That works out to about 4.6 * 32 TB datacenter SSDs per node, which is very plausible.
The data rate works out to 14.8 GB/s per node, so they're definitely using a RAID of some sort. While the fastest PCIe 5.0 client SSDs can more or less hit that, I doubt they can reliably sustain those speeds.
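Here's the same back-of-envelope arithmetic in Python, for anyone who wants to check it (using 1 PB = 1000 TB, as quoted):

    # Per-node numbers derived from the quoted cluster specs
    NODES = 27
    TOTAL_FLASH_TB = 4000          # 4 PB, with 1 PB = 1000 TB
    TOTAL_RATE_GBPS = 400          # aggregate 400 GB/s across the cluster

    per_node_tb = TOTAL_FLASH_TB / NODES       # ~148.1 TB of flash per node
    ssds = per_node_tb / 32                    # ~4.6 if built from 32 TB datacenter SSDs
    per_node_rate = TOTAL_RATE_GBPS / NODES    # ~14.8 GB/s per node

    print(f"{per_node_tb:.1f} TB, ~{ssds:.1f} SSDs, {per_node_rate:.1f} GB/s per node")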
I do wonder what data rate they're actually using. If we assume 148 TB holds an hour of footage, it works out to 41.1 GB/s. Now, I know they rotate between different shows, so the capacity probably needs to hold more like 3 to 5 hours of footage, but even then that's 8-14 GB/s - still an astonishingly high data rate for compressed video, even at 4K.
A raw 4k frame @ 4:4:4 and 16 bits per channel is only 49.7 MB. So, that's about 3 GB/s at 60 fps. 148 TB would let you store 13.78 hours at 60 fps. If we double the frame rate to 120 fps (which makes sense, if you imagine their display panels probably reuse some circuitry from commodity OLED TVs), then we end up with 6.9 hours of storage capacity, which is roughly where I expect it would be.
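Same math as code, with the assumptions spelled out (UHD 3840x2160, three channels, 16 bits each, no compression):

    # Raw UHD (3840x2160) frame at 4:4:4, 16 bits per channel
    frame_bytes = 3840 * 2160 * 3 * 2          # ~49.7 MB per frame
    rate_60  = frame_bytes * 60                # ~3.0 GB/s at 60 fps
    rate_120 = frame_bytes * 120               # ~6.0 GB/s at 120 fps

    node_bytes = 148e12                        # ~148 TB of flash per node
    print(node_bytes / rate_60  / 3600)        # ~13.8 hours of storage at 60 fps
    print(node_bytes / rate_120 / 3600)        # ~6.9 hours at 120 fps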
Huh. So, they're really storing uncompressed video on these nodes? I wonder why. Sure, the simpler your data path, the fewer things can go wrong, but talk about brute force...
I've heard digital cinema uses MJPEG or JPEG 2000. The reason is probably that an error in the data stream only glitches part of a single frame, rather than propagating through the rest of a GOP the way it would with an inter-frame codec. Also, no chance of motion artifacts.
Full 4:4:4 chroma (i.e., no subsampling) is also used, and reportedly the displays can achieve a latency of around five milliseconds or less.
I wonder if they even bother with YUV. At 4:4:4 you could just store RGB and skip the colorspace transform entirely when displaying.
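For reference, this is the kind of per-pixel step a display pipeline gets to skip if frames are stored as RGB 4:4:4. The coefficients are the standard BT.709 ones (full range); whether their pipeline would otherwise look anything like this is purely my assumption:

    # BT.709 YCbCr -> RGB, full range, all values normalized, Cb/Cr centered on 0.
    # Storing RGB directly means never running this per-pixel conversion at display time.
    def ycbcr_to_rgb_bt709(y, cb, cr):
        r = y + 1.5748 * cr
        g = y - 0.1873 * cb - 0.4681 * cr
        b = y + 1.8556 * cb
        return r, g, b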
I also wonder a little bit how they keep the playback machines and displays synchronized. For the machines, I'd guess NTP at a high polling rate? Synchronization to about 1 ms should be adequate, though not excessive when you consider how far an image can move during a rapid pan (poor synchronization would show up as tearing). As for the displays, "genlock" is nothing new... it dates back to the era when TV stations switched between analog feeds and every signal source had to be on the same v-sync to avoid glitches. I've even seen it listed as a feature on some Nvidia Quadro cards a while back.
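As a toy illustration of why ~1 ms should be plenty: if every playback node derives its frame index from a shared, NTP/PTP-disciplined clock, a 1 ms offset keeps nodes within one frame of each other even at 120 fps (~8.3 ms per frame). A minimal sketch with a made-up show epoch, not how The Sphere actually schedules frames:

    import time

    FPS = 120
    SHOW_EPOCH = 1_700_000_000.0   # hypothetical agreed-upon show start (Unix seconds)

    def current_frame_index(now=None):
        """Frame index every node should be scanning out, given synchronized clocks."""
        now = time.time() if now is None else now
        return int((now - SHOW_EPOCH) * FPS)

    # Two nodes whose clocks disagree by 1 ms land on the same frame, or at worst
    # on adjacent frames, since a frame lasts ~8.3 ms at 120 fps.
    t = SHOW_EPOCH + 3600.0
    print(current_frame_index(t), current_frame_index(t + 0.001))   # 432000 432000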
Last among my questions is what OS they're running. I'd guess Linux with the "RT" patch set. That's what I'd use, anyway. We should also consider that they're using a Hitachi storage solution aimed at a much broader application domain, which I doubt supports a true RTOS, nor do I expect it'd be worth porting to one.