I'm surprised that the MTBF of those systems is quoted in "hours".
I mean, one can reasonably expect some redundancy at that scale?
A failing GPU or CPU should not bring the whole computer to a halt.
It doesn't. It degrades the total capacity.
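On the "hours" figure, here is a back-of-the-envelope sketch of why it is not that surprising. It assumes every part fails independently at a constant rate, so the failure rates simply add up. The 60 million parts count is the one quoted from the article in this thread; the per-part MTBF of 50,000 years is purely an illustrative assumption, not a published number, and `system_mtbf_hours` is just a name made up for this sketch.

```python
# Rough estimate of how often *something* fails in a very large machine.
# Assumes independent parts with constant (exponential) failure rates.

HOURS_PER_YEAR = 24 * 365


def system_mtbf_hours(part_mtbf_hours: float, n_parts: int) -> float:
    """Failure rates add up, so system MTBF = part MTBF / number of parts."""
    return part_mtbf_hours / n_parts


n_parts = 60_000_000        # part count quoted in the thread
part_mtbf_years = 50_000    # ASSUMPTION: illustrative value only

mtbf = system_mtbf_hours(part_mtbf_years * HOURS_PER_YEAR, n_parts)
print(f"{mtbf:.1f} hours between failures somewhere in the machine")
# prints "7.3 hours between failures somewhere in the machine"
```

So even if each individual part were absurdly reliable, a failure somewhere in the machine every few hours is what you'd expect. That says nothing about whether each failure stops a job, only how often one occurs.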
Yeah, like if the checkpointing were granular enough, then you'd expect the same work item could just be resubmitted to another node for processing.
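To make the checkpointing idea concrete, here is a toy single-process simulation. It is only a sketch and has nothing to do with how the real machine's scheduler works. All names (`run_job`, `run_on_node`, `NodeFailure`) and the failure probability are hypothetical. Each work item saves a checkpoint after every step; when a node dies, it is dropped from the pool (capacity degrades) and the item is resubmitted, resuming from its last checkpoint instead of starting over.

```python
import random
from collections import deque


class NodeFailure(Exception):
    """Raised when a simulated node dies in the middle of a work item."""


def run_on_node(node, item, checkpoints, rng, fail_prob, steps_per_item):
    """Advance one work item step by step, checkpointing after each step."""
    step = checkpoints.get(item, 0)      # resume from the last checkpoint
    while step < steps_per_item:
        if rng.random() < fail_prob:     # simulated hardware failure
            raise NodeFailure(node)
        step += 1
        checkpoints[item] = step         # granular checkpoint


def run_job(items, nodes, fail_prob=0.05, steps_per_item=5, seed=0):
    """Run all items, resubmitting to another node whenever one fails."""
    rng = random.Random(seed)
    checkpoints = {}
    healthy = list(nodes)
    queue = deque(items)
    while queue:
        if not healthy:
            raise RuntimeError("no healthy nodes left")
        item = queue.popleft()
        node = healthy[0]
        try:
            run_on_node(node, item, checkpoints, rng, fail_prob, steps_per_item)
        except NodeFailure:
            healthy.remove(node)         # total capacity degrades
            queue.append(item)           # same work item goes to another node
    return checkpoints, healthy


done, healthy = run_job(range(20), range(1000))
print(len(done), "items finished,", len(healthy), "nodes still healthy")
```

The point of the sketch is that a failure costs at most one step of one work item plus one node, rather than the whole job.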
“You are going to have failures at this scale. Mean time between failure on a system this size is hours, it’s not days.”
You have to love it when a journalist acts as if he didn't get the meaning of the statement. It obviously means that with more complexity, more maintenance is needed ("the machine that uses 60 million parts in total"), and he didn't blame AMD hardware. He said exactly: "I don’t think that at this point that we have a lot of concern over the AMD products.” So go ahead and complete your "sorry, I'm killing this business's reputation by mistake, I didn't mean to". It is so obvious what he said and what he meant, and it's obvious what you are doing. You are not just a kid. What a corrupted world we live in.
I think you're perhaps a little too cynical. What this looks like to me is just an article too hastily published. The author should have done more digging to find out the impact of the failures, which we're left to speculate about. Do they really take entire jobs offline? Or can the fault-tolerance model handle them without any major hiccups?
True, but only as much as I should be, or less. When he writes a quote and then interprets it as if it meant the exact opposite, what is that? Unprofessional or unfair articles that can lead to a lose-lose outcome for both consumer and seller are unethical. He was quoting the program director, yet he acted as if he just hadn't heard him. He is the one who was being too cynical, but sadly not on point; he aimed at an invented point to hide behind, and then the comments exposed what happened. It wasn't personal, and truth be told, the internet is full of this low-quality writing, and enlightening the reader is not a priority anymore.
You don't have to imagine:
News - US Government's Aurora Supercomputer Delayed Due to Intel’s 7nm Setback (forums.tomshardware.com)
The Department of Energy confirmed the delay of the Aurora supercomputer.