Google used Movidius' first-gen product (alongside the A15 version of Tegra K1) in their 7" Project Tango tablet. I'm not sure it was really necessary (the upcoming Tango phone looks to be Atom-based, so let's see if it also uses a Myriad VPU). Since Tango is a prototyping platform, I think Google took more of a kitchen-sink approach to give software developers more HW to experiment with.
To me, this doesn't seem fundamentally different from a GPU. For power efficiency, the fixed-function blocks are probably the main advantage (and by and large, they don't seem terribly interesting or sophisticated). But there's nothing to stop GPUs from adding those. And I don't buy the predication argument, because it's easy enough to emulate: compute both sides of the branch as vector operations, then mask the results so each lane keeps only the side it needs. If branching is more than a couple of levels deep, then you need a different approach altogether (basically, more threads).
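To make the masking point concrete, here's a rough sketch in plain Python, with lists standing in for vector lanes. The function names and the toy branch (double the magnitude of negatives, add one otherwise) are made up for illustration; this isn't anything from Movidius or a GPU toolchain.

```python
def masked_select(mask, if_true, if_false):
    """Per-lane select: keep if_true where the mask is set, else if_false."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def branchy(xs):
    """Reference version: an ordinary per-element branch."""
    out = []
    for x in xs:
        if x < 0:
            out.append(-x * 2)
        else:
            out.append(x + 1)
    return out


def masked(xs):
    """Emulated predication: evaluate both sides on every lane, then mask."""
    mask = [x < 0 for x in xs]         # the "predicate", one bit per lane
    then_lanes = [-x * 2 for x in xs]  # taken side, computed for all lanes
    else_lanes = [x + 1 for x in xs]   # not-taken side, computed for all lanes
    return masked_select(mask, then_lanes, else_lanes)


if __name__ == "__main__":
    lanes = [-3, 0, 4, -1]
    print(branchy(lanes))
    print(masked(lanes))
```

Both versions give the same answer, but the masked one does the work for both sides on every lane. Nest a few branches and that wasted work multiplies, which is why deep branching needs a different approach.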
I think HoloLens probably does much more in fixed-function blocks. My guess is that Movidius intended richer fixed-function HW, but found a lack of consensus about what was really needed. Maybe that will change in gen-3.
To be honest, what I find most intriguing is their choice of the SPARC ISA. I haven't heard of anyone doing anything with SPARC in a while. It would be interesting to find out what's behind that decision.