Security Researcher Finds Coldplay Lyrics in Kingston SSD Firmware

in order to have structured data emerge, you need some sort of feedback loop to reverse entropy. Otherwise, entropy would increase, not decrease...
and
The odds of them flipping to the exact sequence required for the song to exist would be 1 in 50! (factorial). This would be 1 in:
30,414,093,201,713,378,043,612,608,166,064,768,844,377,641,568,960,512,000,000,000,000
Several errors here. Implicit in the definition of randomness is the fact that any one bit string is as probable as any other: the 50-bit string "011010000...11110" is just as likely to occur as these Coldplay lyrics. It's not until you invoke entropy (roughly, how many microstates correspond to a given macrostate) that the information content becomes relevant.

Which brings us to the calculation above. Assuming 50 bits of information, the denominator shouldn't be 50!, but rather 2^50 (with an information content of -log2(p) bits). But that's a minor error compared to forgetting the numerator of the equation -- the total number of states that would count as a hit. It isn't just the "1" possibility of this specific lyrics segment, but rather the number of all possible meaningful states. Would we be equally 'surprised' by finding the lyrics of an Adele or Lady Gaga tune? (Milli Vanilli would of course pass all possible bounds of disbelief.) Or a Shakespearean sonnet or the preamble to the Constitution? Yes.
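To put numbers on the two denominators and on the numerator point, here is a minimal sketch. It assumes the 50-bit figure used in this thread and independent, uniformly random bits; the helper name `surprisal_bits` and the count of "meaningful" patterns are made up purely for illustration.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information content, in bits, of an outcome with probability p."""
    return -math.log2(p)

n_bits = 50                           # size assumed in this thread
orderings = math.factorial(n_bits)    # 50!: orderings of 50 distinct items
bit_patterns = 2 ** n_bits            # 2^50: distinct 50-bit strings

# Hypothetical count of "meaningful" 50-bit patterns, for illustration only.
meaningful = 1_000_000

p_specific = 1 / bit_patterns
p_any_meaningful = meaningful / bit_patterns

print(f"50!  = {orderings:,}")
print(f"2^50 = {bit_patterns:,}")
print(f"one specific pattern:   {surprisal_bits(p_specific):.1f} bits of surprise")
print(f"any meaningful pattern: {surprisal_bits(p_any_meaningful):.1f} bits of surprise")
```

Whatever the real number of meaningful patterns is, a larger numerator lowers the surprise, which is the point about the numerator.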

Which brings us to points 3 and 4. Entropy can decrease spontaneously. It's just very unlikely to do so -- in a closed system at least. In an open, dissipative system (which a computer is) entropy decreases continually, by virtue of the power consumed by the system. And in fact, if you believe the theories of Dr. England, entropy always decreases in any dissipative system. Not that I'm disputing that these lyrics didn't appear by simple chance: far from it. Just setting the mathematics here on a firmer basis.
 
Since entropy is a scientific concept that I know little about, I chose not to touch it; however, I am familiar with the mathematical principle behind any set of 50 randomly produced variables. Math and odds have nothing to do with relevance. Entropy would change the overall odds of something like this happening, but the arrangement of bits required to create these lyrics is very specific. As an example, most decks of cards come with 52 cards. Every possible ordering of those cards is equally likely, so any given order has a 1 in 52! chance. I did not forget the number of states in the question. Look at the question posed in the context of the article.
What are the odds that bits randomly appeared to form lyrics to a song?
He said specifically "a" song. This is a singular term meaning any one song. Only the odds of that one unique song should be considered, not the odds of any unique song. So my example of any one song would have a numerator of 1.

Let's look at Coldplay's "The Scientist" specifically. Its lyrics have 882 characters by my count. ASCII characters are stored as 8 bits, or 1 byte, of data. This means we are dealing with 882 individual bytes of data. Every byte has 256 possible states, of which ASCII assigns 128 to characters. 882 bytes is equivalent to 7,056 bits. I believe this means there are 2^7056 unique sets of these bits. So the odds of random bits producing the exact 882 bytes required for Coldplay's "The Scientist" are 1 in 2^7056, which funnily enough is a denominator many orders of magnitude larger than the one in my 1 in 50! example. 2^7056 is:
1.1685556647213176552927584334227e+2124

Written out in full with commas, the number is over 2,800 characters long, so I won't bother posting it here.
 
As an example, most decks of cards come with 52 of them. The positions of all those cards have an equal chance of 1 in 52! to be in any given order. I did not forget the number of states of the question.
Playing cards are distinct; bits are not. For a permutation ordering of cards, the relationship is indeed factorial, but for bit patterns, it is exponential.

# Orderings of 50 playing cards out of a deck of 52 = 52!/2!
# Orderings of 50 2-valued bits = 2^50.
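As a rough check of those two counts, the sketch below uses Python's standard `math.perm` for the card orderings and a plain power of two for the bit strings. It only restates the two formulas above; nothing else is assumed.

```python
import math

# Ordered arrangements of 50 distinct cards drawn from a 52-card deck: 52!/2!
card_orderings = math.perm(52, 50)
assert card_orderings == math.factorial(52) // math.factorial(2)

# Distinct strings of 50 two-valued bits: 2^50
bit_strings = 2 ** 50

print(f"card orderings: {card_orderings:.3e}")
print(f"bit strings:    {bit_strings:.3e}")
```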

He said specifically "a" song.
Which means, "the likelihood of 'a' song appearing by random chance." Had this been any other song -- or even a different portion of the lyrics of this same song -- it would have been equally surprising. And "surprise" is how we measure information.

If we look at Cold Play, "The Scientist" specifically. It has 882 characters by my count. ASCII characters are represented by 8 bits or 1 byte of information.
Actually, no. ASCII text has (slightly less than) 7 bits per character -- but far less actual information per character, due to redundancy in the language -- that is, in fact, the entire principle behind data compression. It's a complex subject, but the song lyrics "doobie doobie doobie doobie" contain less information than the (shorter) "on a dark desert highway", which is why, for simplicity, I was accepting your 50-bit estimate of the actual information content.
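One crude way to see the redundancy point is to estimate entropy from the character frequencies within each string. This is only a sketch: a zeroth-order estimate like this ignores repetition at the word and phrase level, so it should be read as a loose proxy rather than the true information content. The function name `empirical_bits` is made up for illustration.

```python
import math
from collections import Counter

def empirical_bits(text: str) -> float:
    """Total zeroth-order Shannon entropy of text, in bits.

    Uses only the character frequencies inside the string itself, so it
    ignores word- and phrase-level repetition and is only a rough proxy.
    """
    counts = Counter(text)
    n = len(text)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n

repetitive = "doobie doobie doobie doobie"
varied = "on a dark desert highway"

for s in (repetitive, varied):
    print(f"{len(s):2d} chars, {8 * len(s):3d} raw bits, "
          f"~{empirical_bits(s):.1f} bits estimated: {s!r}")
```

Even by this simple measure, the longer repetitive string comes out with fewer estimated bits than the shorter varied one, and both are well under their raw bit counts.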
 
Playing cards are distinct; bits are not. For a permutation ordering of cards, the relationship is indeed factorial, but for bit patterns, it is exponential.

# Orderings of 50 playing cards out of a deck of 52 = 52!/2!
# Orderings of 50 2-valued bits = 2^50.
I realized this, and corrected for it in the second post.
Which means, "the likelihood of 'a' song appearing by random chance." Had this been any other song -- or even a different portion of the lyrics of this same song -- it would have been equally surprising. And "surprise" is how we measure information.
I gave the song from the article as my example because I cannot give examples for all songs; such a task would be unachievable. Since I am the one choosing the example to give context to the rarity of such an order of bits, I can control the variables, such as the lyrics, their length in characters, and so on.

Surprise is how you are determining the information; to say that others are doing the same is either an assumption or, at best, an inference.
Actually, no. ASCII text has (slightly less than) 7 bits per character -- but far less actual information per character, due to redundancy in the language -- that is, in fact, the entire principle behind information compression. It's a complex subject, but the song lyrics, "doobie doobie doobie doobie" contain less information than the (shorter) "on a dark desert highway". Which is why, for simplicity, I was accepting your 50-bit estimate of the actual information content.
You are correct that ASCII text has less than a byte of information per character; however, the character could not be read if it were not stored as a byte, so pointing out that it carries less than a byte does not refute the logic or the math.

I admit now that for simplicity's sake I should have just given an example like my second post rather than the first.
 
Several errors here. Implicit in the definition of randomness is the fact that any one bit string is as equally probable as any other: the 50-bit string "011010000...11110" is just as likely to occur as these Coldplay lyrics.
Except it's not 50 bits. By my count, the size of the data block containing their lyrics is 381 characters. Assuming 6 bits per character (probably a little less, but not much less, since they include both cases and some symbols), that's 2286 bits. Written in decimal, that would be a 689-digit number!
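The digit count follows from a logarithm: a positive integer N has floor(log10(N)) + 1 decimal digits. A minimal sketch, taking the 381-character count and the 6-bits-per-character assumption from this post as given:

```python
import math

n_chars = 381          # block size counted in the post (not re-verified here)
bits_per_char = 6      # the post's assumption
n_bits = n_chars * bits_per_char

# log10(2^n) = n * log10(2), so the digit count is floor(n * log10(2)) + 1.
digits_estimate = math.floor(n_bits * math.log10(2)) + 1

# Cross-check against the exact integer.
digits_exact = len(str(2 ** n_bits))

print(n_bits, digits_estimate, digits_exact)
```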

I didn't say it's impossible, but so improbable that it's irrelevant, due to being vastly less likely than a great number of cataclysmic events that we don't even think about.

There's a real lesson here, about the nature of information and self-organizing systems. Treating this as a problem of naked probabilities completely misses that point.
 
It is more likely that one of the ducks that inhabit my backyard will figure out how to start my truck than that these song lyrics would just appear, in proper sequence, in a completely unrelated block of software.


somethingsomething monkeys and typewriters....
 
I realized this, and corrected for it in the second post....You are correct in that ASCII text has less than a byte of information per character, however, this character cannot be read if it were not a byte, therefore, saying it has less than a byte to refute the logic or math is meaningless.
My point was merely to illustrate the difference between raw combinatorics and information. Recoding the lyrics from ASCII to a wider Unicode encoding such as UTF-16 or UTF-32 would multiply the number of possible bitstrings astronomically... but wouldn't change the information content whatsoever.
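A small sketch of the recoding point, using Python's built-in codecs. The sample string is the lyric fragment quoted earlier in the thread; the choice of the little-endian UTF-16 and UTF-32 variants (which add no byte-order mark) is mine, for illustration.

```python
lyric = "on a dark desert highway"   # example string quoted earlier in the thread

sizes = {}
for encoding in ("ascii", "utf-16-le", "utf-32-le"):
    n_bits = 8 * len(lyric.encode(encoding))
    sizes[encoding] = n_bits
    print(f"{encoding:10s} {n_bits:4d} bits -> 2^{n_bits} possible bit strings")

# Decoding any of the three encodings gives back exactly the same text,
# so the message itself is unchanged even though the bit count grows.
roundtrips = [lyric.encode(e).decode(e) == lyric for e in sizes]
print(all(roundtrips))
```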

Except it's not 50 bits. By my count, the size of the data block containing their lyrics is...2286 bits...
I didn't say it's impossible, but so improbable that it's irrelevant
Certainly. My points were that the statement that "entropy always increases" isn't true, and that the calculations for exactly how improbable this was were off by several thousand orders of magnitude. It's a variant of the 'birthday problem' or the so-called 'coincidence paradox' -- coincidences are far more likely than one believes, because, though the chance of one particular coincidence may be one in trillions, the number of all possible coincidences is extremely large.
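The classic birthday problem shows the effect. The sketch below assumes 365 equally likely, independent birthdays; the function name is made up for illustration.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes birthdays are independent and uniform over `days` days.
    """
    p_all_distinct = 1.0
    for k in range(people):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

p_specific_pair = 1 / 365                       # two named people match
p_any_pair = shared_birthday_probability(23)    # any match in a group of 23

print(f"one specific pair matches: {p_specific_pair:.4f}")
print(f"any pair among 23 matches: {p_any_pair:.4f}")
```

A match between two particular people is rare, but with 23 people there are 253 possible pairs, and the chance that some pair matches is slightly better than even.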
 