[quotemsg=18486214,0,10647]Projected image absolutely has depth... I'm not sure HOW it has depth... but when you focus on an item 3' away from you that has a web page overlaid on it, the web page is in focus. When you focus on a different object 20' away, any virtual objects layered on that object are in focus at that range. Finally, if you create an object hanging in midair 8' away, it's not in focus when you focus on an object behind it. Basically... virtual objects layer seamlessly into the real world; there are no DoF shenanigans, or at least none that are at all noticeable.[/quotemsg]
From what I've played with in Unity with a basic side-by-side splitscreen viewed through a standard mobile VR goggle (don't laugh, yeah)... Overall, "depth" is generally achieved via the amount of stereo separation between the left and right cameras, with the corresponding parallax between those cameras and the objects in the virtual 3D space.
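To make that concrete, here's a minimal sketch of the stereo/parallax math (plain Python with made-up numbers, not anything out of Unity itself; the ~64 mm IPD and 1000 px focal length are just illustrative assumptions):
[code]
import numpy as np

IPD = 0.064        # interpupillary distance, ~64 mm for an average adult
FOCAL_PX = 1000.0  # pinhole focal length in pixels; purely illustrative

def stereo_cameras(head_pos):
    """Left/right virtual cameras, offset half the IPD along the head's x-axis."""
    offset = np.array([IPD / 2.0, 0.0, 0.0])
    return head_pos - offset, head_pos + offset

def screen_parallax_px(depth_m):
    """Horizontal disparity (pixels) of a point at depth_m metres.
    Nearer objects get a larger left/right offset, hence 'pop' closer."""
    return FOCAL_PX * IPD / depth_m

left, right = stereo_cameras(np.zeros(3))
for d in (0.9, 2.4, 6.0):   # roughly the 3', 8' and 20' from the quote above
    print(f"{d:3.1f} m -> {screen_parallax_px(d):5.1f} px disparity")
[/code]
Disparity falls off as 1/depth, which lines up with the quoted observation: near objects get a big left/right separation, far ones converge toward zero, so an overlay appears to sit at the distance of whatever it's anchored to. That's the vergence part of the depth cue, anyway; actual eye focus (accommodation) is a whole other can of worms.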
However, as you mention, when it comes to AR and then mapping all that onto real objects, spooky stuff...
[quotemsg=18492439,0,328798]Since the imagery is translucent, perhaps less emphasis is placed on realism. So, they needn't worry about things like lighting, shadows, or sophisticated surface shaders.
Actually, depending on the range and precision of the geometry they're extracting, they could conceivably estimate the positions of real-world light sources and then light (and shadow) CG elements accordingly. Perhaps we'll see these types of tricks in version 2.0, or are they already doing it?
Clever scene graphs and LoD optimizations have long been standard fare for game engines (seriously, I remember people talking about octrees and BSP trees on comp.graphics.algorithms, back in '92). When LoD is poorly implemented, you can see the imagery shudder as it switches to a higher detail level. Anyway, this is a nice little case study.[/quotemsg]
Currently AR is orders of magnitude "easier" to render than VR, if only because of the style of the elements you mention: wireframe overlays, floating UI, and so on.
Octrees, BSP trees, and culling ring a bell, but if I'm not mistaken the bottleneck in GPU fidelity nowadays is not pure polycount. In fact, as a layperson observer, I would say the next big leap for graphics engines is realtime global illumination and ultra-high-resolution (8K per surface) textures, along with "cinematic" post-processing and sophisticated shaders, e.g.: https://www.youtube.com/watch?v=1LamAe-k9As
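For anyone who hasn't run into them, here's a toy octree in Python just to show the idea (illustrative only, nothing like a production engine's implementation): space gets split recursively into eight children, so visibility/LoD queries can skip whole empty regions instead of testing every object.
[code]
import random

class Octree:
    """Toy octree: recursively split a cube into 8 children so spatial
    queries (culling, LoD selection) can skip empty regions wholesale."""
    def __init__(self, center, half, depth=0, max_depth=4):
        self.center, self.half = center, half          # cube: center +/- half
        self.depth, self.max_depth = depth, max_depth
        self.points, self.children = [], None

    def _child_index(self, p):
        # Bit a of the index says which side of the split plane on axis a.
        return sum((p[a] > self.center[a]) << a for a in range(3))

    def insert(self, p):
        if self.depth == self.max_depth:               # leaf: store here
            self.points.append(p)
            return
        if self.children is None:                      # split lazily
            h = self.half / 2.0
            self.children = [
                Octree(tuple(self.center[a] + (h if (i >> a) & 1 else -h)
                             for a in range(3)),
                       h, self.depth + 1, self.max_depth)
                for i in range(8)]
        self.children[self._child_index(p)].insert(p)

    def query(self, lo, hi, out=None):
        """Collect points inside the axis-aligned box [lo, hi], pruning
        every subtree whose bounds don't overlap the box at all."""
        if out is None:
            out = []
        for a in range(3):
            if self.center[a] + self.half < lo[a] or self.center[a] - self.half > hi[a]:
                return out                             # entirely outside: skip
        out.extend(p for p in self.points
                   if all(lo[a] <= p[a] <= hi[a] for a in range(3)))
        for child in (self.children or []):
            child.query(lo, hi, out)
        return out

tree = Octree(center=(0.0, 0.0, 0.0), half=10.0)
for _ in range(1000):
    tree.insert(tuple(random.uniform(-10, 10) for _ in range(3)))
print(len(tree.query((-1, -1, -1), (1, 1, 1))), "points found near the origin")
[/code]
BSP trees do the same job with arbitrary splitting planes instead of axis-aligned cubes; either way the payoff is that a query only walks the branches it overlaps, roughly logarithmic in scene size rather than linear.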
"Sonic Ether" and Nvidia VXGI and Cryengine are close to cracking realtime global illumination (ie. no baked lighting, all lighting, shadowing, light bouncing calculated and rendered in realtime (90fps-160fps?)). VRAM especially with HBM2 means 8K and 16K textures are not too far away. Cinematic post-processing and sophisticated shaders are coming along too.
So yeah, in terms of AR: you could be walking down the street, the AR device computes the surroundings (including light estimation), then renders, say, a photoreal person lit via realtime global illumination and textured accordingly, with post-processing matched to the physical environment you're in (say, a dusty grey day vs. bright blue sunny skies). At that point you could legitimately call it "mixed reality", since you wouldn't be able to tell the difference between a real person standing there and the rendered artificial character.
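No idea what the actual devices do internally, but here's a toy version of that "compute the surroundings including light" step, just to show the shape of it (everything here is a made-up illustration: the capture format, the single-directional-light assumption, the Lambert-only shading):
[code]
import numpy as np

def estimate_dominant_light(directions, radiances):
    """Very rough light estimation: average the sample directions of an
    environment capture, weighted by luminance, to get one dominant
    directional light. directions: (N,3) unit vectors looking out from
    the device; radiances: (N,3) linear RGB samples in those directions."""
    lum = radiances @ np.array([0.2126, 0.7152, 0.0722])   # Rec.709 luma
    d = (directions * lum[:, None]).sum(axis=0)
    d /= np.linalg.norm(d)
    color = radiances[lum.argmax()]                        # tint from brightest sample
    return d, color

def lambert(normal, light_dir, light_color, albedo):
    """Shade a virtual surface with the estimated light (diffuse only)."""
    n_dot_l = max(0.0, float(np.dot(normal, light_dir)))
    return albedo * light_color * n_dot_l

# Fake capture: one bright patch 'up and to the right', dim everywhere else.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(512, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
rad = np.full((512, 3), 0.05)
rad[dirs @ np.array([0.5, 0.8, 0.3]) > 0.9] = [1.0, 0.95, 0.8]   # the "sun"

light_dir, light_color = estimate_dominant_light(dirs, rad)
print("estimated light dir:", np.round(light_dir, 2))
print("shaded ground pixel:", np.round(
    lambert(np.array([0.0, 1.0, 0.0]), light_dir, light_color,
            np.array([0.6, 0.5, 0.4])), 3))
[/code]
Real systems would presumably fit something richer (spherical harmonics, shadow-casting geometry, etc.), but even this crude version gets you a virtual object whose lighting direction and tint roughly agree with the room.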
Lurking on the sidelines is point cloud/voxel stuff which, if/when implemented properly, will make reality go bye-bye forever. So between 3D, VR, AR, and mixed reality, well, we're at the doorstep of the final jack into your head: https://www.youtube.com/watch?v=DbMpqqCCrFQ
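And to make the point-cloud/voxel bit concrete, the crude first step behind most of those demos is just binning scanned points into a sparse grid (toy sketch; the names, sizes, and sample data are all made up):
[code]
from collections import defaultdict

def voxelize(points, colors, voxel_size=0.05):
    """Bin a point cloud into a sparse voxel grid: each occupied cell
    keeps the average color of the points that landed in it."""
    cells = defaultdict(lambda: [0, (0.0, 0.0, 0.0)])
    for (x, y, z), c in zip(points, colors):
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        n, (r, g, b) = cells[key]
        cells[key] = [n + 1, (r + c[0], g + c[1], b + c[2])]
    return {k: tuple(ch / n for ch in rgb) for k, (n, rgb) in cells.items()}

pts = [(0.01, 0.02, 0.00), (0.02, 0.01, 0.01), (0.30, 0.00, 0.00)]
cols = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
grid = voxelize(pts, cols)
print(len(grid), "occupied voxels ->", grid)
[/code]
Everything past that (streaming, LoD over the voxel hierarchy, surface reconstruction) is where the hard part lives, but the core data structure really is that simple at heart.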