Super-Computer Competition!

My O/S is Linux From Scratch, kernel 2.4.5, with 512MB RAM. The gcc version used was gcc-2.95.3 with Athlon optimization patches (though it shouldn't compile for an Athlon in this case; it should compile for i386).

RAM shouldn't make a difference; this thing should fit easily in <1MB, no matter how large your counter gets. Even if it's computing the series by which it gets the square-root result in software, it doesn't have to hold all elements of the series at one time.

Also, this test doesn't actually spawn off separate child processes. That's what clusters excel at--tasks that end up spawning off many independent threads of execution to do work in parallel (e.g. neural nets and the like). It's likely that the process in its entirety got executed on one node of your cluster.
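
Just to illustrate what spawning independent work actually looks like (my own minimal sketch, not part of the benchmark code): here's the same 1-to-1000000000 job split across three child processes on a single box, using fork() and a pipe:

##############################
#include <stdio.h>
#include <math.h>
#include <unistd.h>
#include <sys/wait.h>

/* My sketch (not from the benchmark): split the 1..N sum of square
 * roots across three child processes, each reporting its partial
 * sum back through a pipe. */
int main(void)
{
    const long N = 1000000000L;
    const int kids = 3;
    long chunk = N / kids;
    int fd[2], k;
    double total = 0.0, part;

    if (pipe(fd) == -1)
        return 1;

    for (k = 0; k < kids; k++) {
        if (fork() == 0) {
            /* child: sum sqrt(i) over its own slice */
            long lo = k * chunk + 1;
            long hi = (k == kids - 1) ? N : (k + 1) * chunk;
            double sum = 0.0;
            long i;
            for (i = lo; i <= hi; i++)
                sum += sqrt((double)i);
            write(fd[1], &sum, sizeof sum); /* 8 bytes: atomic on a pipe */
            _exit(0);
        }
    }

    for (k = 0; k < kids; k++) {
        read(fd[0], &part, sizeof part); /* blocks until a child reports */
        total += part;
    }
    while (wait(NULL) > 0)
        ;
    printf("%f\n", total);
    return 0;
}
##############################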

As for the difference in results, it could easily have to do with how far the system takes the sqrt series iteration. There are formulas to compute how far to iterate a series for some degree of accuracy, but I don't remember them (too long since my last Calculus course :wink: ). In any event, you should expect different CPUs to take the series to different iterations, especially since AFAIK, IEEE hasn't defined a floating-point accuracy standard regarding square root calculations.
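
For illustration, here's the classic Newton-Raphson square-root iteration, run until successive guesses agree to a chosen tolerance--my own sketch, and certainly not what the FPU actually does internally. Note how the iteration count depends on the input:

##############################
#include <stdio.h>
#include <math.h>

/* Newton-Raphson square root for x > 0: iterate until successive
 * estimates agree to within a relative tolerance. How many rounds
 * that takes depends on both the input and the accuracy demanded. */
static double newton_sqrt(double x, double tol, int *iters)
{
    double guess = x > 1.0 ? x / 2.0 : 1.0; /* crude starting point */
    *iters = 0;
    for (;;) {
        double next = 0.5 * (guess + x / guess);
        (*iters)++;
        if (fabs(next - guess) <= tol * next)
            return next;
        guess = next;
    }
}

int main(void)
{
    int n;
    double r = newton_sqrt(2.0, 1e-12, &n);
    printf("sqrt(2)   ~= %.12f after %d iterations\n", r, n);
    r = newton_sqrt(2.0e9, 1e-12, &n);
    printf("sqrt(2e9) ~= %.3f after %d iterations\n", r, n);
    return 0;
}
##############################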

Kelledin

bash-2.04$ kill -9 1
init: Just what do you think you're doing, Dave?
 
Those who build SBC rack-mount computers/clusters use a passive backplane installed in a 19" rack mount.

This backplane can have up to 20 PCI slots and 4 processor slots (I'm sure there might be more, but this is an option available now off the shelf).

SBCs come in too many configurations to count: multiple processors, gigs of RAM, onboard everything including the kitchen sink.

Yes, I know there are stand-alone cases for SBC computers.

No, you cannot drop an SBC into a PCI slot, or people would have been doing that for years to upgrade to SMP. The closest thing to this was Apple's Frankenstein called the "DOS Compatibility Card", which was a PC inside your old Mac.
 
Still having problems... I'm not sure how to compile it using the numbers 1 1000000000. When I compile just that code, it runs and says Usage: (path and .exe file name) number1 number2. I'm really rusty with programming and haven't worked with it for quite some time, so if it is some silly error of mine, I apologize 😛

"Trying is the first step towards failure."
 
Oh, Klipsch has its ProMedia 4.2 set out. Advantages over the 4.1 are dual subwoofers (one front, one rear) and independent control for front and back, plus a "re-worked" connection to improve quality. (The last part might well be BS; the 4.1s have near-perfect quality as it is.)

Kelledin

bash-2.04$ kill -9 1
init: Just what do you think you're doing, Dave?
 
OK, cool on all you mention... But did it really take 34 seconds for your 1.3GHz processor to run the calculation?! Because if it really did, the Athlon is the way to go for building clusters... I am not showing off here or anything; I just need to learn, and I think we all do here...
I have seen the same test run on a fast 15-node cluster and the result was just under 1 minute... Can you run your test again and repost?
BTW: When I ran the test in parallel mode under LAM-MPI, I used this to divide the job:
##########################################
export SIGMASQRT=/home/nabil/sigmasqrt

# $OUTPUT must be a named pipe
OUTPUT=output
mkfifo $OUTPUT
lamexec -c 1 n0 $SIGMASQRT 1 333000000 > $OUTPUT < /dev/null &
lamexec -c 1 n1 $SIGMASQRT 333000001 666000000 > $OUTPUT < /dev/null &
lamexec -c 1 n2 $SIGMASQRT 666000001 1000000000 > $OUTPUT < /dev/null &
####################################
and of course, to sum it all up, this goes along with it:


##############################
#include <stdio.h>
#include <stdlib.h> /* for atof() */
#include <math.h>

int main(void)
{
    double result = 0.0;
    double number = 0.0;
    char string[80];

    /* Read whitespace-separated numbers from stdin and add them up. */
    while (scanf("%79s", string) != EOF) {
        number = atof(string);
        result = result + number;
    }

    printf("%f\n", result);

    return 0;
}
####################################

This allowed me to run the same job divided across 3 different CPUs... (each partial sum goes down the named pipe, and the little summing program above reads the pipe on stdin and prints the total).

Anyway, I am just messing around and don't care about all this... I recently got interested in high-performance computing and am just playing around...

BTW: I was thinking of doing Linux From Scratch and am looking for a way to compile the kernel in an environment other than another distribution, as suggested at www.linuxfromscratch.org. Is that how you did yours?!

_______________________________________________~
Software is like sex: it's better when it's free!
 
Well, it's just basic C code, uncompiled... If you are running Windows then I can't help you, but it really should be similar to what I did in Linux.
The 1 to 1000000000 is not part of the compilation process; you pass those in afterward as parameters when running the program...
After you compile, let's say you come out with the final compiled program; then you run it as "test.exe 1 1000000000".
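
For anyone who missed the source earlier in the thread: judging from the Usage message and the output format, the benchmark presumably looks something like this reconstruction of mine (not the actual test.c; you may need -lm when linking):

##############################
#include <stdio.h>
#include <stdlib.h> /* for atof() */
#include <math.h>

/* A guess at the sigmasqrt/test.c benchmark: sum the square roots
 * of every integer from number1 to number2 inclusive. */
int main(int argc, char *argv[])
{
    double start, end, i, sum = 0.0;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s number1 number2\n", argv[0]);
        return 1;
    }

    start = atof(argv[1]);
    end = atof(argv[2]);

    for (i = start; i <= end; i += 1.0)
        sum += sqrt(i);

    printf("%f\n", sum);
    return 0;
}
##############################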


_______________________________________________~
Software is like sex: it's better when it's free!
 
No problem, here it is again:

[ Kelledin@valhalla ~ ] # gcc -O2 -Wall test.c -o sqrt
test.c: In function `main':
test.c:25: warning: use of `l' length character with `f' type character
[ Kelledin@valhalla ~ ] # time ./sqrt 1 1000000000
21081851083600.382812

real 0m24.984s
user 0m24.330s
sys 0m0.020s
[ Kelledin@valhalla ~ ] #

Here's another thing--the time taken by the x87 FSQRT instruction varies according to what its inputs are. Almost all x87 floating-point instructions are like that--they use early-out optimizations to determine when they've done enough work to produce an accurate result. This means that they're faster overall but produce less consistent profiling results.
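
If anyone wants to check that on their own hardware, here's a crude timer sketch (mine, not from this thread; compile with something like gcc -O2 -lm, and whether the two batches differ at all depends on your CPU and compiler):

##############################
#include <stdio.h>
#include <math.h>
#include <time.h>

/* Crude check of input-dependent sqrt timing: run the same number
 * of sqrt() calls over small and large inputs and compare the CPU
 * time. Results will vary by CPU and compiler. */

volatile double sink; /* keeps the compiler from deleting the loops */

static double time_sqrts(double base, long reps)
{
    clock_t t0 = clock();
    long i;
    for (i = 0; i < reps; i++)
        sink = sqrt(base + (double)i);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    long reps = 10000000L; /* 10 million calls per batch */
    printf("small inputs: %.3fs\n", time_sqrts(1.0, reps));
    printf("large inputs: %.3fs\n", time_sqrts(1.0e9, reps));
    return 0;
}
##############################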

I suspect that in your test, the cluster node that had to work with the larger inputs probably spent significantly more time than the other two... about how much time does each node take?

Oh, as for compiling a kernel for Linux From Scratch...the important thing is to compile the kernel with an approved version of gcc (currently gcc-2.91.66, also known as egcs-1.1.2). The way I've handled this is to get my hands on that version of gcc and compile it so that it resides under /usr/kgcc--then link /usr/kgcc/include back to /usr/include, and link /usr/bin/kgcc to /usr/kgcc/bin/gcc. Then I went into my kernel source and edited the top-level Makefile to use kgcc instead of gcc (there are two lines to change).

I had to patch gcc 2.91.66 and then hack the source a bit to get it to compile against glibc 2.2.1. If you want, I can send you the source patch and the patching/compiling instructions.

Kelledin

bash-2.04$ kill -9 1
init: Just what do you think you're doing, Dave?
 
Ok, I found it. Just gotta set the executable parameters to 1 1000000000. It took like 5 minutes or so to run, but I got back the answer 21081851083600.559000 on my Pentium II 350. Anyone know how I can time it so I get an exact figure for how long it took? I'd like to compare it to how fast my Duron 700 runs it, and see what effect overclocking has on both.

"Trying is the first step towards failure."
 
I'm gonna run this on my Linux box @ home and post the results.

The 227.43u is in seconds... therefore about 3:47.

Intel Components, AMD Components... all made in Taiwan!
 
Not meaning to get into an O/S war here...but you might want to get Solaris off those systems and put FreeBSD or Linux on instead. First of all, Solaris (at least v7, a year or two ago) is very expensive to license for commercial use, even just the baseline system. Separate components like the Solaris C/C++ compiler et al cost thousands more to license. Also, Solaris is referred to as <A HREF="http://innominate.org/~tgr/slides/performance/" target="_new">"Slowaris"</A> for a reason...

Please take no offense, it's just a thought.

Kelledin

bash-2.04$ kill -9 1
init: Just what do you think you're doing, Dave?
 
So really what you got is the execution time for 1 job running on one 366MHz processor; the system is multiprocessor, but the way you ran the job is serial.
BTW: It still performed better than my 500MHz processor of the same kind (3:21)... Yours probably has more cache, I bet. Mine is 64-bit and runs on 256K.

_______________________________________________~
Software is like sex: it's better when it's free!
 
So, you tried it for about a day and decided it sucks? Gee, that opinion carries a lot of weight.

As for Linux supercomputers, how about <A HREF="http://www.cs.sandia.gov/cplant/" target="_new">cplant</A>? #84 on the <A HREF="http://www.top500.org/list/2000/11/" target="_new">Top500</A> list.

The field of computational clusters is pretty much dominated by Linux.

In theory, there is no difference between theory and practice.
In practice, there is.
 
Well, I'm sorry I wasn't very clear in my post. Actually, I think Linux, like Win2000 and Unix, is great for servers and workstations in their respective applications, but I don't see it as being good for a supercomputer. If what I was reading in your link is a true supercomputer, then it's just like a server or workstation array--not what I was thinking of when I heard "supercomputer". Anyway, my actual use of Linux was short, but I do a lot of research on many things, since it's my job to do so. So I guess I didn't understand the post. As for desktops, I hope you do agree it's not the best for that area.

Computer Shop owner and Head tech.
 
I was stating that I tried it at home for a day as an end-user box and it sucks. That's what I was trying to say. And I wasn't thinking about computer clusters when he said supercomputer. I don't care what some business thinks about something; if it doesn't work for me, it's crap. Most people get Linux for its price: FREE! I would rather pay for something that supports at least half the crap I've got than download something free that can only run 2 games I have. Aside from the fact that I'm not completely sure he was talking about computer arrays: he's saying what will fit in one box, so the cluster thing is way off what this post is about in the first place. I'll dare you to find one end user using clusters; then the "Linux running clusters" thing will actually mean something.

Computer Shop owner and Head tech.
 
Add an include statement as below to the source:
#include <stdlib.h>
Sorry, replied before reading all the posts...

By the way, the original post definitely said one to one thousand million, so the statement that the square root was 10000 was wrong. Also, 10 times too few iterations--hence the fast execution.
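
As a sanity check on that big sum (my own back-of-the-envelope, not from the thread): by the Euler-Maclaurin estimate,

$$\sum_{i=1}^{N}\sqrt{i}\;\approx\;\tfrac{2}{3}N^{3/2}+\tfrac{1}{2}\sqrt{N}+\zeta(-\tfrac{1}{2}),\qquad \zeta(-\tfrac{1}{2})\approx-0.2079,$$

which for N = 10^9 works out to about 21081851083600.4--agreeing nicely with the 21081851083600.38 result posted earlier.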


<P ID="edit"><FONT SIZE=-1><EM>Edited by frikkie on 05/31/01 01:17 PM.</EM></FONT></P>
 
OK, my supercomputer.
Take a motherboard from an 8-way server. OK, it will need a custom desktop case, but money is of course no object. Eight of the fastest Xeons (PIIIs) with 2MB cache. 4GB of memory, Windows 2000 Enterprise Edition. 2x SCSI320 controllers, six drives, three on each controller, in RAID 0+1. Yamaha sound card with DX7 module. 21-inch LCD flat screen. FireWire controller. The graphics card will be a bit of a problem, as server motherboards do not come with AGP ports, so I may have to use an SGI card in an EISA slot.
That should do it for now.
 
An SGI card in an EISA slot? aaaaack...you're better off with a PCI card! I'm quite sure this 8-way server board has PCI slots...

Oh, and you'd better add something else to your list:

"(1) mickey, to slip to Uncle Moneybags so he doesn't notice how strange your shiny new computer looks." :wink:

Kelledin

"/join #hackerz. See the Web. DoS interesting people."
 
>Aside from the fact that I'm not completely sure he was
>talking about computer arrays: he's saying what will fit
>in one box, so the cluster thing is way off what this post
>is about in the first place

This post isn't about supercomputers. It's about mental masturbation on high-end workstations/game boxes. A supercomputer won't fit in a desktop box.

>I'll dare you to find one end user using clusters; then
>the "Linux running clusters" thing will actually mean
>something

So all those vendors building clusters don't have any customers? There are lots of clusters out there, I imagine there must be some "end users" somewhere in the mix. In fact, since you dared me, here is your "one end user using clusters": Me

I have a 16-CPU cluster I use every day. And my home computers are completely Windows-free.



In theory, there is no difference between theory and practice.
In practice, there is.
 
An easy way to figure out the time on a Unix/Linux machine is to simply wrap the run in date commands:

[joe@linuxmachine joe]$ date;./numbercruncher 1 1000000000;date

I'm sure there is a better method; I just don't know it. If you're using Windows then I really can't help ya.



Blah, Blah Blahh, Blahh, blahh blah blahh, blah blah.
 
Intel will win this game no matter what, because there are more synthetic benchmarks that show Intel CPUs' strong points than there are that show Athlon CPUs' strong points. This whole thread is a farce.

-= This is our wading pool.
Stop pissing in it. =-
 
I would say that Intel will win this thread due to SMP... if you are going for a supercomputer, you don't want to use just 1 CPU, do you...

if at first you don't succeed , destroy all evidence that you ever tried...
 
If it's SMP you're looking for, then you would go with an Alpha, no? But then again... I've never seen an Alpha box smaller than 2' wide and 3' high.

Intel Components, AMD Components... all made in Taiwan!
 
I have done many installs at Sandia, both the New Mexico and CA labs. We have an Origin 3800/1024 in SNL/CA running 16x TurboLinux and Fluent.

As far as the biggest teraflop badass in the world, I believe Blue Gene in San Diego still holds the record.

The world's biggest clusterfuck is in Las Vegas, known as "The Fremont Street Experience". This cluster drives a light matrix in a canopy the size of 2 city blocks. They patched it all together with 640x480 cells. It takes a total of 48 PCs (12x4, not including spares (extreme heat)) to drive the matrix, plus 14 extra machines for messaging, timing, watchdog, backup, and management.

http://www.vegas.com/attractions/off_the_strip/fremontstreet.html

The original investment was over $70 million.

We are replacing this clusterfuck with 1 machine to drive the matrix. New features planned are real-time 3D rendering and compositing.
 
>As far as the biggest teraflop badass in the world, I
>believe Blue Gene in San Diego still holds the record.

Except Blue Gene doesn't actually exist; it's a research program.

<A HREF="http://www.artificialbrains.com/supercomputers/bluegene.html" target="_new">http://www.artificialbrains.com/supercomputers/bluegene.html</A>


In theory, there is no difference between theory and practice.
In practice, there is.