I had some of the same thoughts, bit_user, first about the multiple classifiers, and also about real-world applications. But two classifiers can be thought of as one better classifier (except when one says "skier" and the other says "cat" -- which does the black box decide on?). If you have 1 million pictures of cats and a 99% accurate classifier, then 10,000 of those pictures will be misclassified, and this is a fast way to generate such pictures. But... it works at the pixel level of control, so I don't see it working on cars, since the angle and distance will change -- it isn't at all clear that a printed picture (i.e., a sign) that's misclassified from one angle will also be misclassified from another. (The original paper didn't suggest it would.)
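To make the "pixel-level control" point concrete, here's a toy sketch of a gradient-sign attack against a made-up linear classifier. Everything here (weights, the "cat" label, the perturbation size) is illustrative and not from the article; real attacks work the same way but against deep networks:

```python
import numpy as np

# Toy linear "classifier": score > 0 -> class "cat".
# The weights are random and purely illustrative.
rng = np.random.default_rng(0)
w = rng.normal(size=64)          # one weight per "pixel"
b = 0.0

def predict(x):
    return "cat" if x @ w + b > 0 else "not cat"

# An input the classifier confidently labels "cat":
# pixels are lit wherever the weight is positive.
x = np.where(w > 0, 1.0, 0.0)

# Gradient-sign perturbation: nudge every pixel by eps in the
# direction that lowers the "cat" score.  For a linear model the
# gradient of the score w.r.t. the input is just w, so we choose
# eps barely large enough to push the score below zero.
score = x @ w + b
eps = (score + 1.0) / np.abs(w).sum()
x_adv = x - eps * np.sign(w)

print(predict(x))      # cat
print(predict(x_adv))  # not cat
```

The key point is that every pixel moves by the same small amount `eps`, yet the label flips -- which is exactly why the trick is fragile once the camera angle or distance re-samples those pixels.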
Here's the real security point (in my mind, for the amount of time I've had to think about it): if you have a classifier or evaluator (aka "AI") that is wrong part of the time, assume an adversary will be able to quickly find inputs that make it fail.
And, as you alluded to, bit_user, painting over a sign already "fools" human drivers.