What's Inside Microsoft's HoloLens And How It Works

Excellent writeup!

I'm still not clear on whether the projected image has real depth of field, or is merely projected on a single plane. If the latter, is the plane fixed in space, or can it be focused at different depths? I was hoping it had a light field projector, but it sounds like the image is just 2D. I guess there's always Magic Leap...

Regarding the HPU, I assume the reason they opted not to use the SoC's HD Graphics GPU was that it's needed for rendering. And basing the HPU on DSPs probably made it easier to augment with custom hardware accelerators than if they'd based it on a GPU. Otherwise, a GPU would probably be more power-efficient.
 
The projected image absolutely has depth... I'm not sure HOW it has depth... but when you focus on an item 3' away from you that has a web page overlaid on it, the web page is in focus. When you focus on a different object 20' away, any virtual objects layered on that object are in focus at that range. Finally, if you create an object hanging in midair 8' away, it's not in focus when you focus on an object behind it. Basically... virtual objects layer seamlessly into the real world; there are no DOF shenanigans, or at least none that are at all noticeable.

The HoloLens design is the real deal... top-shelf stuff, given it's still in the tech-demo phase. The only real issue is that the lens design introduces halo artifacts. It's not a deal-killer for the device (I'm still blown away by it), but it is noticeable, and I'm sure it's high on their list of things to reduce/eliminate.
 
Knowing more about the HoloLens is making me interested. At first, I thought it was just some experimental gadget that would go nowhere (like Google Glass), and that I would still have to wait many years before another company attempted this and succeeded, but I am starting to feel this is the real deal.
I really hope this takes off, I would like to try this.
 
That's even more amazing, given that it seems they're only rendering with the downsized HD Graphics GPU in the Bay Trail SoC (about 1/2 to 1/3 of what you'd find in a desktop CPU).

Heh, this is like what Google Glass 3.0 wanted to be. It's a massive leap-frog by MS. I was stunned by the announcement and launch demos, quite frankly.

I get the feeling it even puts Intel's Project Alloy to shame. As far as I know, Magic Leap is the only other thing on the horizon that can even approach it. I actually wonder which cost more to create. Magic Leap is up to like $1.3B in funding, right?
 


We get easily jaded as tech journos, but I have to tell you, this thing is amazing. I tried Google Glass back when it was a Thing...it wasn't great. The HoloLens is far and away superior in every way.
 


*Cherry Trail, and that was my biggest question when they first announced it. If you need a beastly PC to power VR experiences, how on earth was Microsoft doing this with a small, self-contained HMD?

But if you think about it, it totally makes sense. To oversimplify: In VR, the GPU has to render *everything*. Entire worlds. With AR (which is what this is, with apologies to MSFT's insistence on calling it "mixed reality"), you're rendering just *one thing*. And of course the sensors and HPU handle the task of keeping the image(s) in place.

Alloy is a different beast altogether... I have much to say about it (stay tuned), but you can think of it as both VR and AR, in a way. It's a totally occluded headset (like Vive and Rift) as opposed to having a passthrough lens (like HoloLens), but it uses the RealSense camera to "see" the real world and recreate it inside the HMD for you to "see."

I wrote about some of the tricks devs are using to optimize the GPU resources, though. From the mouths of JPL scientists using HoloLens for mission-critical applications: http://www.tomshardware.com/news/vr-ar-nasa-jpl-hololens,31569.html
 
Thanks for the correction. It was a slip on my part, but it's interesting to see that the x7-Z8700 did launch way back in Q1 2015.

Since the imagery is translucent, perhaps less emphasis is placed on realism. So, they needn't worry about things like lighting, shadows, or sophisticated surface shaders.

Actually, depending on the range and precision of geometry they're extracting, they could conceivably estimate the positions of real world light sources and then light (and shadow) CG elements, accordingly. Perhaps we'll see these types of tricks in version 2.0, or are they already doing it?
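That kind of light-source estimation can be sketched very simply. Assuming a Lambertian model (intensity = L · n for lit surfaces), a least-squares fit over sampled surface normals and their observed brightnesses recovers a dominant light vector. This is a toy illustration with invented data, not anything HoloLens is confirmed to do:

```python
# Hypothetical sketch: estimate a dominant light direction from surface
# samples under a Lambertian model I = L . n. All names/data are invented.

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def estimate_light(normals, intensities):
    """Least-squares fit of the light vector L from lit samples (I = L . n),
    via the normal equations (A^T A) L = A^T b."""
    AtA = [[sum(n[i] * n[j] for n in normals) for j in range(3)] for i in range(3)]
    Atb = [sum(n[i] * I for n, I in zip(normals, intensities)) for i in range(3)]
    return solve3(AtA, Atb)

# Synthetic check: light from (0.6, 0.8, 0.0), four sampled surface normals.
true_L = (0.6, 0.8, 0.0)
normals = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.577, 0.577, 0.577)]
intensities = [sum(l * n for l, n in zip(true_L, nrm)) for nrm in normals]
L = estimate_light(normals, intensities)
```

With exact synthetic intensities, the fit recovers the original light vector; real geometry and exposure noise would only allow a rough estimate, which is still plenty for plausible shading.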

Thanks for the link. This sounds specific to their app and landscape rendering, rather than Hololens, generally.

Clever scene graphs and LoD optimizations have long been standard fare for game engines (seriously, I remember people talking about octrees and BSP trees on comp.graphics.algorithms back in '92). When LoD is poorly implemented, you can see the imagery shudder as it switches to a higher detail level. Anyway, this is a nice little case study.
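The "shudder" on detail switches is typically hidden with a hysteresis band around each LoD threshold, so an object hovering near a boundary doesn't flicker between meshes. A toy sketch (distances and margin are invented, not from any real engine):

```python
# Distance-based LoD selection with hysteresis. Thresholds are illustrative.
LOD_DISTANCES = [10.0, 30.0, 80.0]  # boundaries: LOD 0 -> 1 -> 2 -> 3

def pick_lod(distance, current_lod, margin=2.0):
    """Return the LoD level for an object, only switching once the distance
    is `margin` units past the boundary, to avoid flicker at the edge."""
    target = sum(1 for d in LOD_DISTANCES if distance > d)
    if target == current_lod:
        return current_lod
    if target > current_lod:
        boundary = LOD_DISTANCES[current_lod]   # edge we're moving away past
        return target if distance > boundary + margin else current_lod
    boundary = LOD_DISTANCES[target]            # edge we're moving back across
    return target if distance < boundary - margin else current_lod
```

An object at 11 units that is currently at LOD 0 stays at LOD 0 (inside the band); only past 12 units does it drop detail, and it won't climb back until it's closer than 8.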
 


From what I've played with in Unity, using a basic side-by-side splitscreen viewed through standard mobile VR goggles (don't laugh)... overall "depth" is generally achieved through the amount of stereo separation between the left and right cameras, with corresponding parallax between those cameras and objects in the virtual 3D space.
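The stereo-separation effect reduces to a one-line relationship: on-screen disparity between the left and right views falls off inversely with depth, which is why nearby objects "pop" and distant ones flatten out. A back-of-envelope sketch (the IPD and focal length here are illustrative numbers, not HoloLens or Unity specs):

```python
# Pinhole-stereo back-of-envelope: disparity (px) = baseline * focal / depth.
# IPD and focal length are assumed illustrative values.

def disparity_px(depth_m, ipd_m=0.063, focal_px=1000.0):
    """Horizontal pixel disparity between left/right views for a point at
    `depth_m` metres, given the eye baseline and a focal length in pixels."""
    return ipd_m * focal_px / depth_m

near = disparity_px(0.9)   # object roughly 3 feet away
far = disparity_px(6.0)    # object roughly 20 feet away
```

At ~3 feet the views differ by ~70 px; at ~20 feet only ~10 px, so the parallax cue all but vanishes at range.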

However, as you mention, when it comes to AR and then mapping all that onto real objects, spooky stuff...



Currently, AR is orders of magnitude "easier" to render than VR, if only because of the simpler style of the elements, e.g. wireframe overlays and floating UI.

Octrees, BSP, and culling ring a bell, but if I'm not mistaken, the current bottleneck in GPU fidelity nowadays is not pure polycount. In fact, as a layperson observer, I would say the next big leap for graphics engines is realtime global illumination and ultra-high (8K-per-surface) textures, along with "cinematic" post-processing and sophisticated shaders, e.g.: https://www.youtube.com/watch?v=1LamAe-k9As

"Sonic Ether" and Nvidia VXGI and Cryengine are close to cracking realtime global illumination (ie. no baked lighting, all lighting, shadowing, light bouncing calculated and rendered in realtime (90fps-160fps?)). VRAM especially with HBM2 means 8K and 16K textures are not too far away. Cinematic post-processing and sophisticated shaders are coming along too.

So yeah, in terms of AR, you could be walking down the street; the AR device computes the surroundings, including light calculation, then renders, say, a photoreal person lit by realtime global illumination and fully textured, with post-processing suited to the physical environment you're in (say, a dusty grey day vs. bright blue sunny skies). At that point you could legitimately call it "mixed reality", since one would not be able to tell the difference between a real person standing there and the rendered artificial character.

Lurking on the sidelines is point cloud/voxel stuff, which, if/when implemented suitably, will make reality go bye-bye forever. So between 3D, VR, AR, and mixed reality, well, we're at the doorstep of the final jack in your head: https://www.youtube.com/watch?v=DbMpqqCCrFQ

 
Why are you concerned about huge textures? GPUs have all that compute horsepower in order to procedurally generate textures as needed. The resolution of procedural textures is limited only by the precision of the datatypes used to specify the texture coordinates.

Sure, not everything lends itself to coding as a procedure, but most things can at least be decomposed into a macro scale texture map, and smaller, repeating textures. I just don't see any need for 8k or 16k textures. I mean 4k makes sense for mapping screens, windows, and video images onto things, but that's about it.
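The point about procedural resolution being limited only by coordinate precision is easy to demonstrate: a procedural pattern is just a function of (u, v), so it can be evaluated at any scale without ever storing texels. A minimal illustration (a checkerboard stands in for a real shader's noise/detail functions):

```python
# Minimal procedural texture: a function of (u, v), so its effective
# resolution is bounded only by float precision, not by stored texels.
import math

def checker(u, v, scale=8.0):
    """Return 1.0 or 0.0 for a checkerboard, evaluated at any (u, v)."""
    return float((math.floor(u * scale) + math.floor(v * scale)) % 2 == 0)
```

A real engine would compose something like this with macro-scale maps and noise octaves for detail, per the decomposition described above.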

I think that's not what anyone means by "mixed reality".

Anyway, there's a lot you're oversimplifying. Light source estimation will never be perfect, in unconstrained AR applications. It can be "good enough", in most cases, so that rendered objects don't seem jarringly out of place.

But you're also glossing over the whole display issue. Hololens doesn't block the light arriving through the visor. So, you'd be talking about something like Intel's Project Alloy, which is a VR-type HMD + cameras.

Safe to say, it'll be a while before we need to worry about a "Matrix" scenario. I don't even see it happening with conventional silicon semiconductors. Maybe carbon nanotube-based computers, or something else beyond lithography.
 


Again I'm just commenting from the point of view as a layperson graphics enthusiast.

I was commenting on various questions regarding VR/AR and the 3D technologies involved. Perhaps my vision is too forward-looking and goes beyond the current scope of HoloLens. But in regard to some of the questions in this thread, I would still say that global illumination, ultra-high-res textures (procedural textures are useful, but I don't think they can yet deliver total photorealism), and cinematic post-processing will be the coup de grâce for sophisticated, convincing VR/AR, and for what I understand the goal of "mixed reality" to be, beyond simple translucent UI overlays.

However, coming back to present-day Earth: as you mention, HoloLens doesn't block the light per se... I wonder if future iterations will. Project Alloy has an issue with "mixed reality" if what you're looking at comes through a camera; having tried the Samsung Gear VR's camera feature, it's still disorienting, because an external camera can't (yet) match the natural perspective of real eyesight.

But if Hololens can display an image which appears in realspace while you still look through the visor, and that displayed image obscures what's behind it (or has 0-100% opacity control), I think that's quite impressive (and scary), and I wonder if that's what they really want to achieve when they use the term "mixed reality".

As for the "Matrix" scenario that's probably off-topic but sub-10nm silicon semiconductors would be indeed too crude an implementation.
 
Any chance that MSFT is using waveguide displays from Vuzix? Both HoloLens and Vuzix seem to be based on Nokia waveguides. Pasi Saarikko, who used to be a lead engineer for waveguide displays at Nokia, moved to Vuzix and was later working for Microsoft. Just search for "Pasi Saarikko" on LinkedIn. Vuzix also shipped waveguide displays worth 800,000 USD to an unknown party, according to quarterly reports... Vuzix recently published a video showing a Pokémon Go character viewed directly through Vuzix waveguides. The image quality seems comparable to HoloLens. Intel, who was involved in HoloLens development, bought a 30% stake in Vuzix in January 2015. Any thoughts on this?
 


Interesting question. I don't know the answer, but as we learn more about HoloLens and have the chance to talk to more engineers, this is something we can ask about. (We want to know all the things!)
 
Regarding DOF rendering on HoloLens: after playing with the device, I think I understand what it's doing, and in some ways it's even MORE impressive. When you enter a space, it maps the 3D physical space (usually in the background). Then, when you drag a window around that space, it uses that 3D map to snap the window onto appropriate surfaces. You can optionally (while you're dragging) manually move that window closer or further away. When you drop the window, it locks in place in the space. Then you can walk around it, and it's just another object in the space.

It also keeps a database of mapped spaces and appears to be VERY good at recognizing when you return to a space you've been in before. I took the headset home a month after I'd originally played with it, and it recognized all the spaces in my house and re-inserted all the objects I'd originally dropped onto various surfaces. There's a lot of very cool tech in this device.
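The snap-to-surface behaviour described above can be sketched as a gaze ray cast against planes extracted from the room mesh: the window lands at the nearest plane the ray hits. This is a hypothetical illustration of the idea, not Microsoft's actual pipeline; all names and geometry are invented:

```python
# Hypothetical sketch: snap a dragged window to the nearest mapped surface
# along the user's gaze ray. Planes stand in for the scanned room mesh.

def ray_plane_t(origin, direction, plane_point, plane_normal):
    """Distance t along the ray to the plane, or None if parallel or behind."""
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < 1e-9:
        return None
    t = sum((p - o) * n for o, p, n in zip(origin, plane_point, plane_normal)) / denom
    return t if t > 0 else None

def snap_to_surface(origin, gaze, planes):
    """Return (distance, hit_point) for the nearest plane hit, or None."""
    best = None
    for plane_point, plane_normal in planes:
        t = ray_plane_t(origin, gaze, plane_point, plane_normal)
        if t is not None and (best is None or t < best[0]):
            best = (t, tuple(o + t * d for o, d in zip(origin, gaze)))
    return best

# Example: user at the origin gazing down +z; two candidate wall planes.
planes = [((0.0, 0.0, 3.0), (0.0, 0.0, -1.0)),
          ((0.0, 0.0, 5.0), (0.0, 0.0, -1.0))]
hit = snap_to_surface((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), planes)
```

Persisting those anchor points per mapped space is then "just" a database keyed by the recognized room, which matches the behaviour described when returning home a month later.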
 