Don't hold your breath for LLM performance: that's more RAM-bandwidth-limited than size-limited, since generating every token requires another full pass through the weights. There's a reason HBM is so popular there.
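A back-of-the-envelope sketch of why: with batch size 1, each decoded token streams roughly the whole model through memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The numbers below are illustrative assumptions, not measurements of any specific part:

```python
def max_tokens_per_sec(model_bytes: float, bandwidth_bytes_per_sec: float) -> float:
    """Rough upper bound on autoregressive decode throughput (batch size 1):
    every token reads approximately all weights from memory once."""
    return bandwidth_bytes_per_sec / model_bytes

GB = 1e9

# Hypothetical figures for illustration:
model = 7e9 * 2        # 7B parameters at fp16 ~= 14 GB of weights
ddr5  = 100 * GB       # ~100 GB/s, dual-channel DDR5-class PC memory
hbm   = 3000 * GB      # ~3 TB/s, HBM3-class accelerator memory

print(f"DDR5-class: {max_tokens_per_sec(model, ddr5):.0f} tok/s ceiling")
print(f"HBM-class:  {max_tokens_per_sec(model, hbm):.0f} tok/s ceiling")
```

However many TFLOPS the NPU advertises, a commodity PC's memory bus caps a 14 GB model at single-digit tokens per second; that's the gap HBM exists to close.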
Those TFLOPS were functionally fixed at chip-design time and, for all their flexibility, may never see much actual use, because paradigms shift faster than quicksand turns into silicon. To me all these NPUs are dark silicon until proven otherwise.
Their current forte would be dense vision and audio models, perhaps several of them at once or slightly larger ones, but even the management of that model zoo is far from having any useful abstractions, made worse by some of the models having real-time demands. So the chances of NPUs running user-controlled PC workloads are rather slim.
They sort of work on phones, with fixed functional allocations to augmented cameras and audio, but on a PC their principal use is keeping constant track of the user, and I don't know why I should pay for that, except to have it removed.
Of course I wouldn't mind 64 or even 128 GB of RAM at commodity prices. But the trend of bleeding you for RAM may just be another carryover from the fruity cult, and since the amount is fixed at production time, the temptation to exploit your lack of choice is just too big to resist.