BambiBoom writes:
> mapesdhs, Many excellent points.
Thanks! 8)
> Yes, I agree completely that pure clock speed is useful and desirable
> in workstations, my point was that if I were predominately rendering,
> I would rather have more cores / threads than a high clock speed. ...
There's the irony though; due to the efficiency losses when spreading
a rendering load across multiple cores/threads, a higher-clocked CPU
with fewer threads does a lot better than one might expect.
For example, what score do you get from your Dell T5400 for running the
Cinebench 11.5 benchmark? My Dell T7500 gives 10.90 (2x X58 XEON X5570,
8 cores, 16 threads @ 3.2GHz), whereas my 5GHz 2700K gives a very
impressive 9.86 despite having only half as many cores. Meanwhile,
the Dell is beaten even by a stock 3930K (scores 11.13) while my current
3930K oc (4.7GHz) gives 13.58. Of course there are differences such as
ECC RAM, etc., but even so, the way the higher clock makes up for having
fewer cores is fascinating. Have a look at my CPU results:
http://www.sgidepot.co.uk/misc/tests-jj.txt
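For anyone curious about the scaling, here's a quick back-of-envelope
sketch in Python using just the scores quoted above; the "score per
core-GHz" figure is only an illustrative metric of my own, and I've
taken the stock 3930K at its 3.2GHz base clock:

    # Rough per-core, per-GHz throughput from the Cinebench 11.5 scores above.
    # This is only an illustrative metric - real scaling also depends on IPC
    # (Nehalem vs. Sandy Bridge), memory, turbo behaviour, etc.
    systems = [
        # (label, CB 11.5 score, physical cores, clock in GHz)
        ("2x Xeon X5570 (T7500)", 10.90, 8, 3.2),
        ("2700K @ 5.0GHz",         9.86, 4, 5.0),
        ("3930K stock",           11.13, 6, 3.2),
        ("3930K @ 4.7GHz",        13.58, 6, 4.7),
    ]
    for label, score, cores, ghz in systems:
        print("%-22s score %5.2f   score per core-GHz %.3f"
              % (label, score, score / (cores * ghz)))

The dual-Xeon box comes out lowest on that metric, which is the sort of
efficiency loss I mean, though of course part of the gap is simply
Sandy Bridge's better IPC.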
Since high clocks benefit certain pro apps to such a degree, it's no
wonder an oc'd consumer chip gives good results when paired with a
pro GPU.
The sad part is it looks like there will not be a desktop enthusiast
8-core Haswell. Sheesh, two generations on and still no 8-core for X79. :\
Note that some 'consumer' enthusiast X79 boards do support a range of
XEONs, eg. the Asrock X79 Extreme11 can use the 8-core SB-EP E5, and the
board does support ECC RAM, but alas of course those CPUs are costly and
don't have unlocked multipliers, so an oc'd 3930K would outperform a
single E5-2687W. Such CPUs make more sense on multi-socket boards where,
as you say, they really fly (two 2687Ws score 25+ in CB 11.5).
It's likely that being able to run the RAM on my non-Dell systems at
much higher speeds helps for some tasks as well. The T7500 is locked to
1333, whereas I'm running 2400 with my 3930K, and 2133 with the 2700K.
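As a rough guide, peak DRAM bandwidth is just channels x 8 bytes x
transfer rate; a quick sketch (the channel counts are my assumptions
for each platform - triple on the T7500, quad on X79, dual for the
2700K's board - and sustained figures will be lower in practice):

    # Peak theoretical DRAM bandwidth = channels * 8 bytes/transfer * MT/s.
    configs = [
        ("T7500, DDR3-1333", 3, 1333),
        ("3930K, DDR3-2400", 4, 2400),
        ("2700K, DDR3-2133", 2, 2133),
    ]
    for label, channels, mts in configs:
        print("%-17s ~%5.1f GB/s peak" % (label, channels * 8 * mts / 1000.0))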
> ... But yes, I'd love a couple of twelve core Xeons at 4.5GHz. ...
Compromise - sell your soul, get a UV 2000. ;D
> 12-15 core, use DDR4, and be quite fast, though I've not heard any
> specific number. Intel seems to do development from the lower speeds
> at first.
Progress does seem somewhat slower now though. Lack of competition once
again perhaps.
> Your comments are also very welcome as you mention some of the
> important experiential qualities that come into play when using
> workstation applications. ...
Thanks! Many people who don't use pro apps seem to think judging one pro
product vs. another is a simple benchmark comparison process, but as you
clearly are aware from your own experiences, reality is a lot more complex.
One of the most impressive 3D demos I ever saw was on an SGI Onyx RE2
almost 20 years ago (1995 I believe; I was doing Lockheed-funded
undersea ocean systems VR research at the time). A multi-screen
simulation system created by BP, it depicted a small section of an oil
rig, complete with real-time shadows (something the BP guys thought was
very important for safety testing and assessment), rendered with
full-scene subsample AA, etc. See:
http://www.sgidepot.co.uk/misc/oilrig.jpg
The update rate was about 7 to 10Hz. At first I thought that was kinda
slow, but then the BP guy explained how the system worked...
They did not have the model available in the native IRIS Performer
database format used by the gfx system for rendering real-time 3D
scenes; instead, they only had their multiple internal proprietary
databases which their engineers used every day for their work, eg. steel
infrastructure, electrical piping, gantry ways, water supplies, emergency
systems, cooling ducts, etc.
BP wanted something that could bring these all together to give them a
broader picture and to reveal things which the separate databases couldn't
show, eg. if a gas feed pipe in one database was spatially going to
interfere with a gantry way from another database. Preventing such
issues can save hundreds of thousands, perhaps millions, when a
maintenance job includes the attendance of other sea vessels at the rig.
Thus, the system combines and converts ALL these databases into a
Performer database again and again and again for every single frame.
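To give a flavour of the interference-checking side of it, here's a toy
sketch in Python; the names, numbers and bounding-box approach are
purely mine, nothing to do with BP's actual code or the Performer API:

    # Objects from separate engineering databases are merged into one list
    # and tested for spatial interference using axis-aligned bounding boxes.
    def overlaps(a, b):
        """True if two AABBs ((lo, hi) corner tuples) intersect on all 3 axes."""
        return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

    # (source database, object name, AABB lo corner, AABB hi corner), metres
    items = [
        ("piping",   "gas_feed_07", (10.0, 2.0, 5.0), (25.0, 2.3, 5.3)),
        ("gantries", "gantry_B",    (18.0, 1.8, 4.9), (19.2, 2.5, 5.6)),
    ]

    for i, (src_a, name_a, lo_a, hi_a) in enumerate(items):
        for src_b, name_b, lo_b, hi_b in items[i + 1:]:
            if src_a != src_b and overlaps((lo_a, hi_a), (lo_b, hi_b)):
                print("Clash: %s/%s interferes with %s/%s"
                      % (src_a, name_a, src_b, name_b))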
Though the frame rate was low at the demo (note the later IR gfx runs it
at 60Hz no problem), I was more impressed by it than some of the other
demos because I realised the others were not pushing what the system
could really do (at least not according to the tech guy I spoke to
afterwards who turned out to be one of the designers of the later IR
gfx), whereas the oil rig demo was showing something fundamental about
the system, namely that it could handle very complex real-life problems
that don't match the simplistic idea most people have of what a 3D
task must be (ie. pushing polygons).
It was obvious talking to the BP guys afterwards that they were very
proud of what they'd achieved. They said that the model, good though it
was, only used 0.5% of the available image data, since the full oil rig
database contained 3.5 trillion triangles. Their system was doing custom
LOD management to stop the whole demo's complexity racing away.
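For anyone who's not met the idea, here's a minimal sketch of what
budget-driven LOD selection looks like; it's my own simplification with
made-up numbers, certainly not their actual scheme:

    # Distant objects get coarser representations so the frame's triangle
    # total stays under a fixed budget.
    # LOD levels per object: (triangle count, max viewing distance in metres)
    LODS = [(50000, 30.0), (5000, 120.0), (500, float("inf"))]

    def pick_lod(distance_m):
        """Triangle count of the finest LOD acceptable at this distance."""
        for tris, max_dist in LODS:
            if distance_m <= max_dist:
                return tris
        return LODS[-1][0]

    def frame_triangles(object_distances, budget=2000000):
        total = 0
        for d in sorted(object_distances):   # nearest objects first
            tris = pick_lod(d)
            if total + tris > budget:
                tris = LODS[-1][0]           # demote to the coarsest level
                if total + tris > budget:
                    continue                 # cull it if even that won't fit
            total += tris
        return total

    # Toy example: 1000 objects spread between 5m and 505m away.
    distances = [5 + i * 0.5 for i in range(1000)]
    print(frame_triangles(distances), "triangles drawn this frame")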
Though it's an old example, these concepts of real-world complexity
still apply today. Defense imaging, GIS, medical, aerospace and other
tasks involve very large datasets (tens to hundreds of GB) and various
kinds of host preprocessing, often on a per-frame basis just like the
oil rig system above. Running these kinds of tasks on gamer hw isn't
remotely viable.
> ... One of the problems in this kind of
> discussion is that those with gaming oriented systems have not
> experienced use of 3D CAD and rendering applications to the level
> where the workstation cards become not only useful, but mandatory.
Exactly! I've talked to lots of people who use professional systems for
an incredibly wide range of tasks. Gamers can no doubt appreciate the
basics of the demands of, say, 3D animation and video editing, but other
tasks in industry are a lot more demanding and often quite unexpected in
their performance behaviour. Tasks range from controlling textile
knitting/printing machines (datasets of 5GB+) to cutting pork carcasses
more efficiently; all of them involve complex real-time processing of
2D/3D data with great precision, where reliability is critical.
Pro-type benchmarks are useful, but they can only be one of many data
points used in the decision making process. In many cases other factors
are more important, especially reliability. I know places still using
SGIs that are 20+ years old because reliability is so important,
especially in the fields of medical imaging, aerospace and industrial
process control. I hear from companies who've been running the same
system for almost 15 years non-stop.
:D When they finally decide it's
time to replace it with an all-new setup, they struggle to find anything
remotely similar in terms of long-term reliability. Someone at BAe told
me they have to design systems which they will have to maintain for as
long as 50 to 75 years.
> Especially important are the viewports, artifacts and reliability.
Back in about 2000, when SGI really started to lose out to emerging pro
cards for PCs, I had an opportunity to compare an SGI Octane2 V12 to a
P4/2.4 PC with a GF4 Ti4600, running Inventor. The Ti4600 was about five
times faster than the V12; in the research department in which I worked
at the time, this difference was a key factor in their opting for PCs as
upgrades over the existing lab of old SGIs. However, the image quality
of the GF4 was absolutely dreadful. The texture mipmapping was rubbish,
the geometry precision was poor, etc. But the GF4 was massively cheaper,
and that's where industry as a whole was heading - low cost above all
else. At that time, SGI kept opting for ever greater image quality &
fidelity such as 64bit colour, instead of raw performance improvements,
even though many customers, especially in film/TV/CAD, wanted the latter.
In the end, this mistake killed their gfx business.
Anyway, although consumer cards have come a long way since then, the
same type of quality issues still apply today, though these days most
performance differences are typically related to driver optimisations.
Also, sometimes the obvious choice based on a simple test can lead to
unexpected issues. In 2002, after upgrading from SGI VW320 systems to P4
PCs with the above Ti4600, the researchers found the PCs to be as much
as one hundred times slower than the old SGI dual-PIII/500 VW320s. The
cause was the way in which the researchers had created their large 3D
models, exploiting the VW architecture that allows "unlimited" textures
to be used (the urban models they created had a lot of texture,
typically 200MB+, but not that much geometry), and they used a great
many large composite 16K textures as a means of referencing multiple
subtextures in the models. This works fine on a VW320 (since main RAM
and video RAM are the same, so loading a texture means just passing a
pointer), but on a normal PC back then it was VRAM thrash-death,
dropping the frame rate from about 10 to 15Hz on a VW320 to less than
one frame per minute on the PC. So the researchers had to completely
redesign their models: no more composite textures, lots of detail
management, always mindful of the 128MB VRAM limit on the GF4. Only
once this was done did they finally see the big speed improvements
they'd been expecting.
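Some quick arithmetic shows why those composite textures were so lethal
to a 128MB card (assuming uncompressed texels and ignoring mipmaps; I
don't know the exact formats they used):

    # Back-of-envelope texture memory arithmetic.
    def texture_mb(width, height, bytes_per_texel):
        return width * height * bytes_per_texel / (1024.0 * 1024.0)

    print("16K x 16K composite, 4 bytes/texel: %4.0f MB" % texture_mb(16384, 16384, 4))
    print("16K x 16K composite, 1 byte/texel:  %4.0f MB" % texture_mb(16384, 16384, 1))
    print("GF4 Ti4600 VRAM:                     128 MB")
    # Even at 1 byte per texel, a single composite is double the card's total
    # VRAM before any geometry or framebuffer, so per-frame texture traffic
    # over the bus was unavoidable - exactly the thrashing described above.
    # On the VW320's unified memory, "loading" a texture is just passing a
    # pointer.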
The irony is that SGI's custom architecture - designed to solve certain
3D bottlenecks - ended up at least in this case making the researchers
kinda lazy in how they created their models. Perhaps these types of
experiences are why performance improvements today so often tend to be
from brute-force changes rather than anything really innovative. Note
that many ex-SGI people moved to NVIDIA, eg. when I was testing the O2
for SGI in the late 1990s, the SGI manager I reported to, Ujesh Desai,
is now a senior person at NVIDIA. Later, some of the same people moved
to ATI and elsewhere.
> After using a Dell Precision T5400 with the original Quadro FX 580
That reminds me, I have an FX 580 I've not tested yet...
> ... and I could always add a second one in SLI if needed.
Such a shame that so many pro apps can't exploit SLI, and even when
they can, the driver lockouts linked to mbd/chipsets are infuriating.
> instead of x16. The navigation in my large Sketchup models is not
> blazing fast, but it doesn't freeze in Solidworks - in short, all
> problems solved. ...
8)
How does the 5500 compare to the 5800? Makes me wonder if even a
used Quadro 4000 would give you a good speedup.
> mentioned, are among the most important aspects of evaluating
> workstation graphics cards and missing in a speed-only focus.
Indeed. Real-world decisions can be messy & rather involved.
> When Quadro K5000's are sold used for $1,000,...
:D
Ian.