AMD's Mantle API Gives Devs Direct Hardware Control

Status
Not open for further replies.


Can you give a link where AMD itself says the software will be open source? AFAIK AMD has only said that Mantle is open to anyone interested, including Nvidia, but so far I have seen no evidence that Mantle will be open source. Being open to use does not make the software open source.
 
What's with you gamers getting your hopes up about gaming OSes? A low-level API does not mean they are opening the specifications of their hardware, and they are not open-sourcing their drivers (weren't AMD's already open source? I don't remember...). Anyway, it would be impractical to write a whole new general-purpose OS that would run on every PC. An "OS" for custom hardware, maybe, but you already get that with your PS4, Xbox, Steambox or whatnot. So does anyone know if they have released any documentation yet?
 


They're still writing it... it's still in the "theoretically it should be better" phase.
 


Ahhh, but it is cross-hardware. You just use the drivers to compile your code. AMD's new GPUs are fully C/C++ capable. As long as your GPU supports C/C++, you just have your code compiled, which Mantle does at run time.



It's not the translation that's expensive, it's the system call itself. Crazy expensive. The average current game manages a maximum of about 3,000 draw calls per second; the best PC games, highly optimized by very intelligent programmers with a lot of GPU experience, may get 10,000 draws per second. Mantle will get you about 100,000 draws per second.

That also ignores the scaling Mantle brings. It is relatively simple to add an extra video card and double your performance in compute loads.

 

There is no such thing as a "C/C++ capable" CPU/GPU/whatever, since almost any CPU can "support" C/C++ if given a proper compiler. The caveat is how many workarounds for instruction-set and architectural deficiencies an implementation of C/C++ may require: even 8-bit microcontrollers from 12+ years ago that were never meant to be programmed with anything more advanced than assembly language can "support" C++, albeit at a horrible code-size and memory cost, since they lack the hardware resources to support efficient function calls, never mind virtual ones.

With enough effort, you can fit a square peg in a round hole.
 
Like I said, theoretically, it should be better. They have yet to show anything or even theorize what the additional draw calls will do for the end user in a head-to-head comparison. The most they've said is that they wouldn't be wasting their time for a few percent performance increase.
 


To fully implement C/C++, your processing unit must support certain features, and until recently many GPUs did not. AMD went through the hassle of making sure their GPUs can.

While someone could write workarounds that effectively emulate the features, with most of those band-aids you would probably be better off just using the CPU.
 

Likely because there isn't that much to theorize about: ~1/9th the overhead per call x ~9X as many calls per scene = roughly the same amount of time spent calling the API, so the net result may end up roughly the same. What that does do is allow programmers to write 3D code more naturally, such as one or two draw calls per object instead of packing multiple objects or similar surfaces into each call to mitigate the amount of time spent on DX/OGL API calls. Frame rates may not necessarily increase by much, since the total amount of back-end rendering power is still the same, but they could become smoother due to the steadier flow of stuff to render.
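To put rough numbers on that, here is a minimal sketch in Python. The call counts and per-call costs are made-up placeholders chosen only to match the ~1/9th and ~9X ratios above; they are not measurements of DX, OGL or Mantle.

```python
# Illustrative arithmetic only: every number here is a hypothetical placeholder.

def api_time_ms(calls_per_frame, overhead_us_per_call):
    """CPU time per frame spent issuing draw calls, in milliseconds."""
    return calls_per_frame * overhead_us_per_call / 1000.0

# Hypothetical frame that batches objects to keep the call count down.
batched_calls = 1000
batched_overhead_us = 9.0

# Same scene with ~9x as many calls at ~1/9th the cost per call.
natural_calls = batched_calls * 9
natural_overhead_us = batched_overhead_us / 9

before = api_time_ms(batched_calls, batched_overhead_us)
after = api_time_ms(natural_calls, natural_overhead_us)

print(f"batched frame: {before:.1f} ms in API calls")
print(f"one call per object: {after:.1f} ms in API calls")
```

Both lines come out the same, which is the point: under these assumptions the time budget is unchanged and what you gain is the freedom to skip the batching.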

As for compute loads, I doubt Mantle can do much about scaling on its own there: only the application developer knows his algorithms' specifics and how best to partition the problem across multiple compute resources to minimize traffic and latency over PCIe and system RAM so manual intervention is required regardless of API for optimal results. Wasting even 1% of processing time on inter-process communication limits useful scaling to about 100 threads, which becomes problematic when you are trying to extract performance out of a multi-GPGPU setup with 2000+ threads.
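A quick way to see where that "about 100" ceiling comes from is Amdahl's law. The sketch below assumes the 1% spent on communication behaves like a fraction that cannot be parallelized; that modelling choice is mine, and real multi-GPU overhead may grow with the number of workers rather than stay fixed.

```python
# Amdahl's law sketch: speedup when a fixed fraction of the work cannot be
# spread across workers. Treating the 1% communication cost as that fraction
# is an assumption made for illustration.

def speedup(workers, serial_fraction):
    """Ideal speedup over one worker under Amdahl's law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

overhead = 0.01  # 1% of the time lost to inter-process communication

for workers in (10, 100, 1000, 2000):
    print(f"{workers:5d} workers -> {speedup(workers, overhead):6.1f}x")

# The curve can never exceed 1 / overhead, no matter how many workers you add.
print(f"upper bound: {1.0 / overhead:.0f}x")
```

Past a few hundred workers the extra hardware buys almost nothing, so at 2000+ threads nearly all of the benefit has to come from cutting the communication itself.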
 

If it's Turing complete, then it should be able to handle C/C++ regardless of architecture. Not necessarily efficiently though.
 

Supporting something and EFFICIENTLY supporting something are two different things.

To be able to fully support C++, you need little more than an architecture that supports using registers to do register-based jumps or the ability for software to edit the stack to the same effect and a compiler that implements the necessary tricks. Yes, it may be horribly inefficient on small microcontrollers and other architectures not intended for it (as I said in my original post) but it works. Sure, most programmers would not bother but hackers get a kick out of making hardware do stuff it isn't supposed to do just to prove that it can be done regardless of how impractical it might be.

 


If it can't compete, then it doesn't work. If I had a sorting algorithm that had an O(N!) complexity, I wouldn't be bragging about it. Ohh, it'll sort your large dataset, just not in your lifetime. Well... not sure that I would say it's usable.

In the real world, "technically works" is not the same as "It works". Unless time or energy is not a concern.
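For anyone who wants to see what an O(N!) sort looks like, here is a minimal sketch of a permutation sort in Python. It is only an illustration of "technically works": the function name and the one-billion-permutations-per-second rate are made up for the example.

```python
from itertools import permutations
from math import factorial

def permutation_sort(items):
    """Try every ordering until one is sorted: correct, but O(N!) in the worst case."""
    for candidate in permutations(items):
        if all(a <= b for a, b in zip(candidate, candidate[1:])):
            return list(candidate)

print(permutation_sort([3, 1, 2]))

# Worst-case number of orderings to check, and how long that would take at a
# hypothetical rate of one billion permutations per second.
rate_per_second = 1_000_000_000
for n in (5, 10, 15, 20):
    years = factorial(n) / rate_per_second / (365.25 * 24 * 3600)
    print(f"N={n:2d}: {factorial(n)} orderings, about {years:.3g} years")
```

It does return a correctly sorted list, but by N=20 the worst case is already measured in decades at that assumed rate.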
 

What can and cannot compete is relative: it does not matter if a given architecture cannot make efficient C++ virtual function calls if none of those calls occur in a performance-critical code path or the whole application is not time-critical at all in the first place.
 


AMD GPUs aren't binary compatible with each other either; that's why each GPU has its own driver, and the driver does the compiling. Mantle only requires that a GPU support certain features. If a GPU does, then it's just a matter of implementing the interface. Nvidia can implement the interface and it'll work.
 