You're treating each word as having the same information content? Clearly, that's not so. A word like "the" carries almost no information, which you can tell from the fact that you can remove it in most of the places it's used and the meaning is neither changed nor lost. By contrast, a word like "antidisestablishmentarianism" is a complex, conceptually rich term that is loaded with meaning and context.
To look at it another way, one more in line with the article, Claude Shannon defined information in terms of surprise. When you can predict the next word with high confidence, it's unsurprising and thus has low information content. This was best exemplified in a demonstration Shannon ran with some students back in 1951, where the goal was to estimate the entropy of the English language by having people guess the next letter of a text.
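To make the "information as surprise" idea concrete, here's a minimal sketch (my own illustrative probabilities, not anything from the paper): the information content of an outcome with probability p is -log2(p) bits, so a word you could have predicted anyway carries almost nothing.

    # Sketch of "information as surprise": surprisal = -log2(probability).
    # The probabilities below are made up purely for illustration.
    import math

    def surprisal(p):
        """Bits of information carried by an outcome with probability p."""
        return -math.log2(p)

    print(surprisal(0.95))    # a near-certain "the": ~0.07 bits, almost no information
    print(surprisal(0.0001))  # a rare, unexpected word: ~13.3 bits, highly surprising

Summing these per-symbol surprisals over a long text, weighted by how often each outcome occurs, is essentially what an entropy estimate of English amounts to.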
The relevance to the finding in this paper is that a human speaker isn't making a decision, word by word, about what to say next. The speaker is making decisions about words, phrases, and even whole sentences. For instance, if I choose to conclude a short anecdote with the phrase "all's well that ends well", that was essentially one decision. Yes, I plucked that phrase from a large space of possible options, but the space isn't as large as it may seem when you consider that I probably have only some hundreds of such stock phrases in my everyday speech.
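Back-of-the-envelope, with numbers I'm making up for illustration: picking one stock phrase out of roughly N equally likely options conveys about log2(N) bits for the whole phrase, which is far less than what a word-by-word accounting would suggest.

    # Illustrative arithmetic only; the ~500 figure is an assumption, not data.
    import math

    n_phrases = 500                     # assume ~500 stock phrases in my repertoire
    phrase_bits = math.log2(n_phrases)  # ~9 bits for the entire decision
    print(f"{phrase_bits:.1f} bits")

    # Scored word by word instead, a six-word phrase at even a few bits per word
    # would look like considerably more "information" than the single choice made.

So if the paper's methodology charges the speaker per word, it may be overcounting the decisions the speaker actually made.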
But it's not. So the place to start disputing the claim is by understanding their definitions and methodology. It's a shame the paper is paywalled, but I'm sure more can be gleaned about these details, given that I've heard this same finding repeated by other news outlets.