5870x2?

For me, this idea of ATI's is cool and all, but will the games actually be written to take advantage of the new technology? You can already see the problem with SLI and CrossFire: not all games benefit from either one.

Also, will the drivers be able to command such a beast (4 cores)? I only hope it won't be another 4850 X2 (bad at everything).
 
It's down to how the processors will be optimised for running typical graphics applications. It's hard to say without actually looking at some of the computationally expensive algorithms used in computer graphics, and then looking at how they'd run on either parallel architecture, so I'll refrain from resorting to conjecture.

But at any rate, having your unified shaders split across multiple cores will lower performance rather than increase it -- though whether the difference is immeasurably small or substantial enough to have an impact remains to be seen. If they go for a multicore architecture, a number of issues can arise, as the cores now have to communicate over a bus rather than directly with each other, and all the typical multiprocessor architectural considerations come into play. That brings up the usual concurrency issues: synchronisation, mutual exclusion, timing, prioritising use of the bus, the ability to disable the bus, and the management of shared memory. Multiple cores may also require the use of shared memory plus common memory or some such related scheme.
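
Just to make the synchronisation point concrete: here's a toy sketch in CUDA-style C (nothing to do with how the RV870 or any real chip is wired internally, and the kernel names are made up, with the host-side launch code left out). The more execution units have to agree on one piece of shared state, the more the work serialises on that state; keeping updates local and merging them at the end stays cheap.

#include <cuda_runtime.h>

// Naive version: every thread contends for the same global counter, so
// thousands of threads end up serialising on one memory location.
__global__ void count_positive_naive(const int *data, int n, int *total)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] > 0)
        atomicAdd(total, 1);                // global synchronisation point
}

// Friendlier version: threads in a block first combine their results in
// fast on-chip shared memory, then only one thread per block touches the
// global counter, so contention drops by a factor of the block size.
__global__ void count_positive_staged(const int *data, int n, int *total)
{
    __shared__ int block_total;
    if (threadIdx.x == 0) block_total = 0;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] > 0)
        atomicAdd(&block_total, 1);         // cheap: stays on-chip
    __syncthreads();

    if (threadIdx.x == 0)
        atomicAdd(total, block_total);      // one global update per block
}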

But using additional cores will often be the cheaper, easier option.

NVIDIA's strategy when it comes to building GPUs seems to favour a single powerful monolithic design, while ATI seems to want to compete by using multiple weaker chips instead of one monolithic design. Both techniques could be considered roughly equivalent as far as performance goes, but I wouldn't expect a big advantage from a multicore design, nor get overly excited by it.

The reason multicore CPUs are an interesting proposition is that, by the very nature of their design, CPUs don't allow for much parallelism, so multicore architectures are a very welcome evolution there. GPUs, on the other hand, are designed for parallelism from the ground up and already use "multiple processing elements" in the hundreds. I just don't see any advantage in adding more cores other than as a simpler way of scaling up performance without investing too heavily in R&D.

Edit:
@soundefx, as for games, I'd expect the scaling across additional GPUs to depend more on how the hardware and driver interact with DirectX 11 or OpenGL than on the games directly, unless a specific game is designed to use a "GPU-accelerated software renderer" (a fresh concept that has come up recently). As long as the driver is optimised to efficiently map the DX11 API onto the hardware, we probably won't have any problems with games... at least on the majority of the hardware. That's what I'd expect, anyway.
 
I've read, though unfortunately I can't find it anymore 🙁, that DX11 and compute shaders will help immensely with scheduling. They showed graphs broken down to the millisecond(?) where the GPU timing is almost always great, but there are delays whenever the CPU gets involved, which should go away -- and this wasn't being done on W7, and didn't incorporate W7's use of multithreading through the GPU pipe.
ATI has claimed their scheduler is over twice as complex as before, and the overall theory isn't that all functions are shared, only those that are compatible.
Like I said, I'm only a layman here and don't get it all, but it does have much better heads than mine getting scratched ATM.
What's interesting too is the timing. These rumors all started 2+ years ago, about the time they were starting work on the 5xxx series, though that's just coincidental.

Correction, it was done on an early W7 release
 
http://www.neoseeker.com/news/9078-ati-hd5870-rumors-1-5-tflops-40nm-1000-shaders-and-multi-core-/ This is a very old rumour, but if there is any truth to it, then 1.5 TFLOPS will be the performance of the RV870. The RV800 will be a dual RV870 and hit 3 TFLOPS, assuming 100% scaling.

And it is possible ATI might go for a dual-card design and do a "5870 X4", which would be two RV800s = 4x RV870. That should put the card at 6 TFLOPS... nothing to scoff at.

however, "GTX 385" or equivalent is expected to do a minimum of 3TFLOPS. So the RV800 -- "5870 X2" (?) -- will compete with this, while the top end card will probably be something similar to the GTX295 -- i.e. dual "GTX 385" (or maybe dual GTX 375?) on a single card producing 6TFLOPs.


Whatever the reality turns out to be, it looks like graphics performance is going to double or treble. Not bad news by any measure.
 
Rumors are running that ATI is ditching AFR for SFR, maybe. Could play into DX11 and compute shaders' abilities.
Read also that DX11 is dumping more onto the GPU, though I suppose it may help the scheduling as well, maybe a tradeoff?
Been reading some SIGGRAPH papers. Seems DICE wants DX11 badly, and is building their new engine explicitly for it.
 
Nice. Yeah, I'm trying to find the papers on those DX11 scheduling algorithms you mentioned. I'm guessing it's got something to do with the thread processor on current DX10 GPUs -- no idea what DX11 GPUs look like architecture-wise as of yet, but I'm guessing that's the parallel.
 
Yea, they'll run on current DX10 hardware, provided there's support.
I've got the SIGGRAPH papers:
http://s09.idav.ucdavis.edu/
Yea, it was a small program and a "timer", if you will.
What surprised me is that, if this plays out in real life, it could help end micro stutter, since the scheduling, interconnects etc. are all ramped up, as well as the CPU/GPU multithreading.
Though if it's not coded right, or overloaded (yeah right), it may eventually have no effect.
Though we could see more Crysis-type games down the road.
 
Interesting-looking course, jaydeejohn. Is it a one-off thing at SIGGRAPH 2009, or is it a course that will also be delivered at UC Davis?

Some algorithms are ideal for parallel implementation on hardware such as GPUs, but the biggest problem with implementing general-purpose algorithms on GPUs is the programming complexity. Writing sequential algorithms with low parallelism on CPUs is a much simpler affair. It's a lot harder to implement algorithms that run correctly on massively parallel hardware, and a lot harder to verify them. When the program needs to be optimised around thousands of threads executing (largely) concurrently, it's not an easy feat from a programming perspective.
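
To give a feel for where the complexity comes from -- and this is just a textbook-style sketch of my own, not code from any driver or paper, with the host-side launch left out -- even summing an array, a one-line loop on a CPU, turns into something like this on a GPU, where you suddenly have to think about thread indices, barriers and partial results:

#include <cuda_runtime.h>

// CPU version: trivially correct, but completely sequential.
float sum_cpu(const float *x, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += x[i];
    return s;
}

// GPU version: launch with 256 threads per block; each block reduces its
// slice of the array in shared memory and writes one partial sum. Get a
// __syncthreads() wrong, or drop the i < n guard, and the result is
// silently corrupted -- which is the verification problem in a nutshell.
__global__ void sum_gpu_partial(const float *x, int n, float *partial)
{
    __shared__ float buf[256];
    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    buf[tid] = (i < n) ? x[i] : 0.0f;
    __syncthreads();                        // every value must be in buf first

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            buf[tid] += buf[tid + stride];
        __syncthreads();                    // wait before reading neighbours again
    }

    if (tid == 0)
        partial[blockIdx.x] = buf[0];       // one partial sum per block
}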

I have a cousin who does post-doctoral research in that area: implementing GAs (Genetic Algorithms) that "evolve" a solution to parallel programming tasks across some 100,000 processors.

Systems built along these principles on CUDA or DX11 compute-powered GPUs could be a promising step in the right direction. It looks like NVIDIA is also trying hard to support and assist startups built around CUDA -- so I guess it sees CUDA in its roadmap for years to come.

This paper outlines the use of GAs in scheduling for supercomputing clusters -- a problem that sits in the hardest class of non-deterministic polynomial time problems. Unfortunately you have to pay to download it... 🙁
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V12-4CDS49V-2&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&_docanchor=&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=516bf94802da059a53a72d1d258c1d52

EDIT: http://www.nvidia.com/object/gpu_ventures_program.html Here's the link for anyone planning to start a company based on CUDA 😛
 
I see a parallel (heheh) between GA usage and F@H, which also explains the wider use of CPUs' abilities, but also shows how slow they are within a highly parallelised solution, with the multicore consoles somewhere in between.
As CPUs gain cores, and GPUs learn how to loop/predict/correct better -- while also being pushed more towards non-fixed-function, with software doing it all -- and as DX11, CUDA and OpenCL mature, eventually there'll be a crossover.
Right now, I do know GPUs are "cheating" and hiding their latencies to some extent. Having OpenCL will eventually make the GPU makers conform in this direction, making improvements, while LRB will also have its own impact, which can be borrowed from by either side to an extent.
 
Yeah, it will be good to see how the DX11 implementations turn out. And OpenCL. I've only ever used OpenGL, and that was with GLUT in C rather than directly, so that it would compile with both gcc and VC++ 6.0.

I downloaded the CUDA documentation from NVIDIA. It's a LOT simpler to follow than even your average microcontroller or microprocessor datasheet. Very nicely written.

As I suspected, the CUDA architecture is a set of extensions on top of ANSI C. Typically, with dedicated or embedded processors, the languages can be very different from their standardised ancestors. Microchip's PIC microcontrollers and digital signal processors, for one, use a dialect of C that is completely incompatible with ANSI C and looks quite a bit like the underlying RISC assembly language native to the processor. For that reason, with embedded processors I usually just used the assembly language instead. But here the code looks a lot more like standard C.
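
For anyone curious, this is roughly what a trivial CUDA program looks like -- a vector add I've sketched in the style of the programming guide rather than copied from it, so treat the details as illustrative. The __global__ qualifier and the <<<blocks, threads>>> launch syntax are the main extensions; everything else reads like ordinary C.

#include <cuda_runtime.h>

// Each thread adds one element; the grid is sized so every index is covered.
__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMalloc((void**)&a, bytes);   // device allocations
    cudaMalloc((void**)&b, bytes);   // (host-side setup and cudaMemcpy calls
    cudaMalloc((void**)&c, bytes);   //  are omitted to keep the sketch short)

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    vec_add<<<blocks, threads>>>(a, b, c, n);   // the CUDA-specific launch syntax
    cudaDeviceSynchronize();

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}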