News Full scan of 1 cubic millimeter of brain tissue took 1.4 petabytes of data, equivalent to 14,000 4K movies — Google's AI experts assist researchers

The article said:
A recent attempt to fully map a mere cubic millimeter of a human brain took up 1.4 petabytes of storage just in pictures of the specimen.
...
All of this is to say the human brain is an impossibly dense and very smart piece of art, and the act of mapping it would be both impossibly expensive ...
The key detail you seem to be overlooking is that the pictures are a very inefficient representation of the information content in those neurons and synapses.

You go on to make a flawed assumption that all of the pictures would need to be stored, but this is not so. I'm sure the idea is to form a far more direct representation of the brain's connectivity & other parameters. This could be done on-the-fly, eliminating the need ever to store all of those pictures.
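To make that concrete, here's a minimal sketch (my own illustration, not anything from the researchers' actual pipeline; the field names are assumptions) of what such a "direct representation" could look like: a graph of cells and synapses with a handful of parameters each, emitted as the image tiles are segmented.

```python
# Hypothetical "direct representation": a graph of neurons and synapses with a
# few parameters each, instead of the raw imagery. Field names are illustrative
# assumptions, not the dataset's actual schema.
from dataclasses import dataclass

@dataclass
class Synapse:
    pre_neuron: int    # ID of the presynaptic cell
    post_neuron: int   # ID of the postsynaptic cell
    x_nm: int          # position of the synapse within the volume, in nanometres
    y_nm: int
    z_nm: int
    strength: float    # e.g. estimated from cleft size / vesicle count

# Each record is only a few dozen bytes. A segmentation pipeline could emit
# these on the fly as image tiles are processed, so the tiles themselves would
# not need to be retained once the parameters were extracted.
connectome = [
    Synapse(pre_neuron=17, post_neuron=42, x_nm=1250, y_nm=980, z_nm=310, strength=0.7),
]
```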
 
The key detail you seem to be overlooking is that the pictures are a very inefficient representation of the information content in those neurons and synapses.

You go on to make a flawed assumption that all of the pictures would need to be stored, but this is not so. I'm sure the idea is to form a far more direct representation of the brain's connectivity & other parameters. This could be done on-the-fly, eliminating the need ever to store all of those pictures.
There is no way they would not store all of the raw data for future analysis. They absolutely would need to be stored long term for future use and reference. No way they wouldn't do that. They could get away with a much less useful dataset by not doing it, but no way they would go through the entire process and throw out anything useful.

Edit: actually this is wrong to some degree, continued in a post below. Without the original data, it can't be checked.
 
There is no way they would not store all of the raw data for future analysis.
They wouldn't if it's too costly. At the very least, you can bet they'd use some advanced compression techniques to exploit the characteristics of volumetric data and wouldn't merely store them as a bunch of PNG files.
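As a toy illustration of what exploiting the characteristics of volumetric data buys you (my own example, not the researchers' method): adjacent sections are highly correlated, so even just delta-coding each slice against its neighbour before running a generic compressor does far better than compressing each slice independently like a standalone image.

```python
# Toy demo: compressing each slice independently (the "bunch of PNG files"
# approach) vs. delta-coding adjacent slices first to exploit their similarity.
# Synthetic data only; real pipelines would use far more sophisticated codecs.
import zlib
import numpy as np

rng = np.random.default_rng(0)
# A smooth 2D structure repeated across 64 slices with a little noise, so
# neighbouring slices are highly correlated (loosely like serial EM sections).
base = rng.normal(size=(256, 256)).cumsum(axis=0).cumsum(axis=1)
stack = np.stack([base + 0.1 * rng.normal(size=base.shape) for _ in range(64)])
stack = ((stack - stack.min()) / (stack.max() - stack.min()) * 255).astype(np.uint8)

# Each slice compressed on its own:
per_slice = sum(len(zlib.compress(s.tobytes())) for s in stack)

# First slice plus deltas between neighbouring slices, then compressed:
deltas = np.diff(stack.astype(np.int16), axis=0)
volumetric = len(zlib.compress(stack[0].tobytes())) + len(zlib.compress(deltas.tobytes()))

print(f"independent slices: {per_slice / 1e6:.2f} MB")
print(f"delta-coded volume: {volumetric / 1e6:.2f} MB")
```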

They absolutely would need to be stored long term for future use and reference. No way they wouldn't do that.
Also, I'd point out that we're debating what researchers might want to do, rather than the actual information density of those photographs, which was my original point and seems to lie at the heart of what the article was claiming.
 
They wouldn't if it's too costly. At the very least, you can bet they'd use some advanced compression techniques to exploit the characteristics of volumetric data and wouldn't merely store them as a bunch of PNG files.


Also, I'd point out that we're debating what researchers might want to do, rather than the actual information density of those photographs, which was my original point and seems to lie at the heart of what the article was claiming.
Right. There are always shortcuts that work "good enough", just like compression algorithms throwing away all the bits that aren't needed.
 
Size, of course, comes with an asterisk. What takes 1.4PB today took many times that not long ago, and will take many times less than that in the not-very-far future.
Totally agree that the numbers are off. Even today, commercially available HDDs reach capacities of 20TB or more. Hyperscale providers like Google are not spending anywhere close to $0.03/GB. At that price a 20TB HDD would be ~$600, yet one can be found on Amazon for ~$225. They are getting a much better price at volume. At the Amazon price we are at ~$0.011/GB, which would put the total at ~$18B, but at this scale we should expect at least a 40% discount, bringing the cost to ~$10.8B.

Then there is the footprint. Let's assume that the hyperscalers can fit 500 HDDs in a standard 19" 42RU rack (no RAID, just pure storage + compute management).
500 HDDs * 20TB = 10PB per 19" of linear space, or 253k feet of linear space for a single copy (as the article suggests). Data center floor space is some of the most expensive real estate on the planet, and if nothing could be done to improve compression and/or HDD density, that is an enormous amount of space with a very low power density compared to a typical data center. It would most likely require specially designed data centers, built where land is very cheap, to make it cost effective.
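For anyone who wants to check those numbers, here's a quick back-of-the-envelope script. The ~1.6 ZB whole-brain total is my assumption (it's the figure that makes the $18B and 253k-foot estimates above come out); the drive price and rack density are the estimates stated above, not quotes.

```python
# Back-of-the-envelope check of the cost and footprint estimates above.
# Assumes a whole-brain dataset of ~1.6 ZB (decimal units), 20 TB drives at
# ~$225 each, and 500 drives per 19-inch rack; all rough figures, not quotes.
brain_gb = 1.6e12                      # ~1.6 ZB expressed in GB

price_per_gb = 225 / 20_000            # ~$225 / 20 TB  ->  ~$0.011/GB
retail_cost = brain_gb * price_per_gb  # -> ~$18B
bulk_cost = retail_cost * 0.6          # assume ~40% volume discount -> ~$10.8B

drives = brain_gb / 20_000             # number of 20 TB drives needed
racks = drives / 500                   # 500 drives per rack = 10 PB/rack
linear_feet = racks * 19 / 12          # 19-inch racks, side by side

print(f"{drives:,.0f} drives in {racks:,.0f} racks = {linear_feet:,.0f} linear feet")
print(f"retail ~${retail_cost / 1e9:.1f}B, with volume discount ~${bulk_cost / 1e9:.1f}B")
```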

But to your point, as storage density increases (the roadmap to 40TB isn't too long), the cost will come down over time.
Size, of course, comes with an asterisk. What takes 1.4PB today took many times that not long ago, and will take many times less than that in the not-very-far future.
 
So if we keep peering deeper and deeper, will we eventually find out whether we live in a world, or in a simulation of a world?
Your argument belongs to a religion called Gnosticism.

If we were living in a simulation, there would be a long Inception chain of simulated worlds, nested one inside another, but we would have to be either the first one (where all the others are simulated) or the last one (simulated inside all the others), since we cannot yet simulate sentient worlds.

Hence the probability of being simulated is practically zero.
 
...They are getting a much better price at volume...
Wrong. That's a common misconception.

You only get a discount at volume when the seller has excess capacity.

When the demand is too high, you have to fight against other buyers, by paying more, so prices skyrocket.

Higher demand -> higher prices.

That's why GPUs got so expensive: too much demand after mining and AI.
Also, that's why large public works projects always underestimate their costs. They calculate costs at present market prices, but as soon as they actually try to buy the stuff, prices skyrocket.
 
If they could expand these scans/reconstructions to a cubic cm and compare samples from a healthy, smart individual to, say, someone with known neurological conditions, they might be able to glean a lot of helpful information.
For example, they could figure out whether those tightly curled axons are a feature of how we store memories, or how we lose them, or maybe something totally different. It is well known that our brains have redundancy built in; maybe those mirrored neurons are part of that architecture. Whatever it is, there's so much more we could learn as AI technology gets better.
 
Your argument belongs to a religion called Gnosticism.

If we were living in a simulation, there would be a long Inception chain of simulated worlds, nested one inside another, but we would have to be either the first one (where all the others are simulated) or the last one (simulated inside all the others), since we cannot yet simulate sentient worlds.

Hence the probability of being simulated is practically zero.
As someone who creates simulators and emulators, I can attest that simulations and emulations are usually designed to fully use the capacity of the hardware they run on. They could never nest like that. The closest I've ever come would be a simulation capable of using parallelism to explore branches simultaneously. If we're one of many simulations, they either execute one after the other or in parallel.

And the information content of our universe would have to be a tiny fraction of that of the universe containing the simulation, so simplified physics in our universe would be a certainty. We simulate universes all the time, and the models are extremely simplified versus our own. The fraction of content contained in them, compared to our universe, is less than that of an atomic particle to our planet.
 
Wrong. That's a common misconception.

You only get a discount at volume when the seller has excess capacity.

When the demand is too high, you have to fight against other buyers, by paying more, so prices skyrocket.

Higher demand -> higher prices.

That's why GPUs got so expensive: too much demand after mining and AI.
The GPU case was special because there was a single manufacturer.
In the case of storage we have multiple manufacturers and multiple technologies to use.
The stored data is read-only, so we could also use optical media, tape, and so on.
For this use case, I think we can achieve a really, really low price/GB.
 
Wrong. That's a common misconception.

You only get a discount at volume when the seller has excess capacity.

When the demand is too high, you have to fight against other buyers, by paying more, so prices skyrocket.

Higher demand -> higher prices.

That's why GPUs got so expensive: too much demand after mining and AI.
Also, that's why large public works projects always underestimate their costs. They calculate costs at present market prices, but as soon as they actually try to buy the stuff, prices skyrocket.
As someone who has actual experience in making purchases for a data center I can tell you that you DO get a discount if you order more. AWS, for example, pays at most 50% of the retail price for a server because they order them by the thousands. The same is true for all the other components in a server.
 
In the case of storage we have multiple manufacturers and multiple technologies to use.

That's irrelevant when you need to buy more than all the manufacturers produce.
As someone who has actual experience in making purchases for a data center I can tell you that you DO get a discount if you order more.
You are just saying that you are too tiny to matter.

You both make the same error of linearization. You wrongly extrapolate from small scale, which looks linear, to large scale: "The tiny piece of the planet where I move around looks flat, so the entire Earth is flat".

"When I buy soap at the supermarket the price doesn't change, so if we gift a million dollar to each person, they can go buying stuff without inflation". It's magical socialist thinking.
 
They wouldn't if it's too costly. At the very least, you can bet they'd use some advanced compression techniques to exploit the characteristics of volumetric data and wouldn't merely store them as a bunch of PNG files.


Also, I'd point out that we're debating what researchers might want to do, rather than the actual information density of those photographs, which was my original point and seems to lie at the heart of what the article was claiming.
Sorry, but there is no way this science would get green-lit with any funding at all if everything weren't documented and kept documented. This science means nothing without the raw data. It has to exist and be checked by peer review to some degree, or the whole thing could be bunk/a mistake.

It's not about what they want to do, it's about what they had to do scientifically to make it viable. You can't destroy the data and still do real scientific research: that research cannot be checked, and in this case it cannot be replicated from the source material. Without that, it just isn't science as we currently understand it.
 
You are just saying that you are too tiny to matter.

You both make the same error of linearization. You wrongly extrapolate from small scale, which looks linear, to large scale: "The tiny piece of the planet where I move around looks flat, so the entire Earth is flat".

"When I buy soap at the supermarket the price doesn't change, so if we gift a million dollar to each person, they can go buying stuff without inflation". It's magical socialist thinking.
By your argument, if AWS needs 10k servers that retail for $5,000 each, calls up Supermicro, and those 10k are the entire stock Supermicro has, then AWS is going to pay full retail price. That is completely wrong. AWS is going to pay $2,500/server on that order, and Supermicro will have done several months of sales in one day. Both companies will be happy. Again, I HAVE done data center purchases. I know how this works.
 
Your argument belongs to a religion called Gnosticism.

If we were living in a simulation, there would be a long Inception chain of simulated worlds, nested one inside another, but we would have to be either the first one (where all the others are simulated) or the last one (simulated inside all the others), since we cannot yet simulate sentient worlds.

Hence the probability of being simulated is practically zero.
I believe what you say here. For us to be living in a simulation, there would also be many more signs of it. A simulation could never be perfect and would have many flaws. People who believe we are a simulated race have obviously been brainwashed or duped into believing it. This planet still holds secrets we have yet to discover, and I believe we will find a type of storage that is truly non-volatile, meaning it can never be destroyed or corrupted by age or physical damage. It will be the change we need to store a brain's data and possibly transfer that data to a cybernetic lifeform, to be carried on forever.
 
Civility and respect are required of all members. Attack ideas, with information and sources, when possible. DO NOT attack each other. Insults and name-calling are not allowed and will be removed.

Thank you.
 
Wrong. That's a common misconception.

You only get a discount at volume when the seller has excess capacity.

When the demand is too high, you have to fight against other buyers, by paying more, so prices skyrocket.

Higher demand -> higher prices.

That's why GPUs got so expensive: too much demand after mining and AI.
Also, that's why large public works projects always underestimate their costs. They calculate costs at present market prices, but as soon as they actually try to buy the stuff, prices skyrocket.
You are right and wrong with this statement. GPUs skyrocketed in price because of demand: Nvidia/AMD couldn't manufacture them fast enough to meet what the market was asking for, and that is why prices skyrocketed. Storage is a bit different; manufacturers produce far more than demand is asking for, which is why they can give low prices to companies that use a lot and lock in contracts for those prices.
 
As someone who has actual experience in making purchases for a data center I can tell you that you DO get a discount if you order more. AWS, for example, pays at most 50% of the retail price for a server because they order them by the thousands.
Is that in comparison to the Dell list price? Because those machines are crazy-overpriced compared even to what I can build with parts from Newegg! Obviously, Dell servers have some features you won't get from a Supermicro or Tyan server board, but if we're just talking about the cost of the hardware, there are some cushy margins already built into those things.

On the other hand, if you mean relative to the "white box" price it would cost me to order an OCP-spec machine, then a 50% markdown from that is indeed quite impressive!
 
Sorry, but there is no way this science would get green-lit with any funding at all if everything weren't documented and kept documented. This science means nothing without the raw data. It has to exist and be checked by peer review to some degree, or the whole thing could be bunk/a mistake.
Heh, again you're talking about issues of methodology, whereas the article seems to be making a claim about information density. I'm stepping past the part about methodology (which I already partially addressed in my comment about using better compression, BTW) and focusing on the core claim of information density.

In order to extract the information content of something like a brain, you need to capture at extremely high resolution, even though each image might contain like 1/1000th as much information as its size on disk, if that. The article's authors seem to assume 1:1. Basically, what you need to determine about each neuron, dendrite, and synapse (i.e. in order to accurately simulate it) is a handful of parameters. However, in order to accurately measure those, you need a comparatively large number of pixels.
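To put rough numbers on that gap (purely illustrative: the ~150 million synapse count is the figure widely reported for this dataset, and the bytes-per-synapse value is just a guess at that "handful of parameters"):

```python
# Rough illustration of the information-density gap: raw EM imagery vs. a
# compact per-synapse parameter table. The synapse count is the widely
# reported figure for this 1 mm^3 sample; 64 bytes/synapse is an assumption.
raw_bytes = 1.4e15          # ~1.4 PB of images
synapses = 150e6            # ~150 million synapses in the cubic millimetre
bytes_per_synapse = 64      # assumed: pre/post IDs, position, strength, type

compact_bytes = synapses * bytes_per_synapse
print(f"compact representation: ~{compact_bytes / 1e9:.0f} GB")
print(f"raw imagery is roughly {raw_bytes / compact_bytes:,.0f}x larger")
```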
 