MIT CSAIL Researchers Show How Vulnerable AI Is To Hacking

Status
Not open for further replies.
Though I tend toward the positive side of advancing AI, my dark side is thinking, "Whhhhhoa ..." We don't know what we don't know.

The havoc that could be wreaked by this kind of insidious hackery is nearly unimaginable. It's like we need to advance secure 'good' AI to protect us from the evolution of really nasty 'bad' AI. The potential damage is boundless.

On the road these days I'd trust an autonomous vehicle more than most drivers. It's not difficult to envision automobile hackery -- even of tens of thousands of cars in a single 'global incident.' The problem I have is with the truly disgusting, demented sickos we can't imagine, with their monumental, societal, long-term douchery.

Example: CSAIL researchers are making wonderful advances using AI in the diagnosis of early cancers. Combined with enhanced imaging techniques, these advances are of incredible benefit to us all ...

but if the AI is vulnerable and not highly secure, some sick, aggrieved whack-job could 'worm' their way in --- using nasty AI to spoof medicos into thinking third-stage malignancies are benign fatty tissue. Years down the road, you have thousands dying from cancers that were waved off as harmless.

It just seems to me that secure must be conjoined with AI as a global priority (s-AI ?) --- or we are preordained to 'Hacker with Mental Illness Crashes Subway Trains In Catastrophic Disaster' headlines.

The industry must learn to police itself and highly secure its AI, because our fearless politicos are simply too dumb or indifferent (or bought off...).


 

bit_user

Splendid
Ambassador
I wonder how difficult it would be to fool two parallel, independent classifiers.

I do think the real-world relevance of this is somewhat limited, but it's definitely a tool that can be used for mischief, harm, and theft.

One consequence I find rather troubling is that, given this exploit requires access to the classifier, some will undoubtedly be tempted to seek security via obscurity by simply trying to lock up their models and APIs.

BTW, let's not forget there are some rather famous optical illusions that hack the human visual system to fool us into seeing vastly different interpretations.
 

michaelzehr

Distinguished
Sep 18, 2008
I had some of those same thoughts, bit_user, first about the multiple classifiers, and also about real-world applications. But two classifiers can be thought of as one better classifier (except when one says skiers and the other says cat -- which does the black box decide on?). If you have 1 million pictures of cats and a 99%-accurate classifier, then 10,000 of those pictures will be wrongly classified, and this is a fast way to create such pictures. But... it's pixel-level control, so I don't see it working on cars, since the angle and distance will change -- it isn't at all clear that a printed picture (i.e. a sign) that's miscategorized from one angle will be miscategorized at another angle. (The original paper didn't suggest it would.)

Here's the real security point (in my mind, for the amount of time I've had to think about it): if you have a classifier or evaluator (aka "AI"), and it is wrong part of the time, assume an adversary will be able to quickly find examples of input that fail.
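To make that point concrete, here's a minimal numpy sketch -- with a made-up "ground truth" rule and a deliberately imperfect stand-in classifier, neither taken from the paper -- showing that even plain random search over inputs surfaces failure cases quickly whenever a classifier is wrong some of the time:

```python
import numpy as np

rng = np.random.default_rng(42)

def truth(x):
    # The "real" label: an arbitrary rule standing in for ground truth.
    return int(x.sum() > 0)

def classifier(x):
    # Imperfect model: a slightly tilted, slightly shifted boundary,
    # so it disagrees with truth() on a thin slice of inputs.
    return int(x @ np.array([1.0, 1.0, 0.8]) > 0.1)

# An "adversary" just sampling random inputs and keeping the failures.
failures = [x for x in rng.normal(size=(10_000, 3))
            if classifier(x) != truth(x)]

print(len(failures) > 0)
# → True
```

The attack in the paper is far more efficient than random search, but the underlying lesson is the same: a nonzero error rate means exploitable inputs exist, and an adversary only has to find them.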

And, as you alluded to, bit_user, painting over a sign already "fools" human drivers.
 

bit_user

Splendid
Ambassador

If the two classifiers use different architectures or operational principles, then the paper's method might have difficulty converging. Neural networks are designed to have a relatively smooth error surface to facilitate training. But the joint error surface from two different kinds of classifiers might have lots of discontinuities.

GANs are another. In fact, I wonder if there's some relationship.

https://en.wikipedia.org/wiki/Generative_adversarial_network


Exactly my thoughts - this is constructing a still image, whereas most real-world applications involve video and objects moving in 3D space. The false image is probably pretty fragile (although did you see the article about the 3D printed turtle?). Generally speaking, it would be impractical to spoof self-driving cars' classifiers without putting things in the environment (i.e. on or near roadways) that look obviously out-of-place.

Furthermore, I expect self-driving cars will use depth information to help them ignore things like billboards and advertisements affixed to other vehicles. With depth, they'll also be much more difficult to fool.


Humans are wrong in so many cases that it hardly even registers. I can't even count the number of times I've had to do a double-take after misinterpreting some shapes in a darkened room. Or maybe some bushes give the impression that a figure is standing among them. Heck, people see shapes in the clouds, a man in the moon, and various faces and likenesses in everything from water stains to pieces of burnt toast and misshapen Cheetos.
 

Olle P

Distinguished
Apr 7, 2010
This won't work in real life.
What they did was first have the original picture correctly interpreted as a dog. Then they changed not only the pixels but also the AI's "knowledge", by making small changes to the picture and having the AI interpret every single iteration!

The AI would not have interpreted the final image as "dog" if it was shown that one alone, or just after the original image.
 

bit_user

Splendid
Ambassador

It did require access to be able to run the classifier and inspect the output, but they didn't change the classifier as you seem to imply.

The real-world significance is that if someone builds a product using a standard classifier, like this one from Google, then other people can use the method to create images that spoof the classifier without having access to its source code or weights. It does mean they either have to figure out which classifier it's using, or at least find some way to directly access whatever classifier it's using.

I agree that it's of limited usefulness - mostly for spoofing web services, IMO. Not a big threat for self-driving cars or video surveillance cameras, for instance.
 

Olle P

Distinguished
It's a learning AI, and they taught it, in many small steps, that the final image was indeed an image of a dog.

Take another identical AI that has (only) the same original training and show it the final image directly: it won't interpret it as a dog, because it hasn't "seen" the transformation.
 

bit_user

Splendid
Ambassador

Their method doesn't change the classifier - it modifies one image to classify the same as another image.

Put another way: given an immutable classifier and two images, A and B, find a small set of changes to image B that make it classify as A (and hopefully still looks to humans like B).
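That setup can be sketched in a few lines of numpy. The tiny 3-"pixel" image and the hand-picked weight matrix below are purely illustrative assumptions (nothing from the paper's actual model); the point is that the classifier stays fixed while only the input is nudged:

```python
import numpy as np

# Immutable toy classifier: class scores are a fixed linear map.
W = np.array([
    [ 2.0, -1.0,  0.5],   # class 0
    [-1.0,  2.0,  0.5],   # class 1
    [ 0.5,  0.5,  2.0],   # class 2
])

def classify(x):
    return int(np.argmax(W @ x))

x_b = np.array([1.0, 0.0, 0.0])   # "image B", classified as class 0
target = 1                         # the class of "image A"

x = x_b.copy()
for _ in range(50):
    current = classify(x)
    if current == target:
        break
    # Step in the direction that raises the target class's score
    # relative to the currently winning class.
    x = x + 0.05 * (W[target] - W[current])

print(classify(x_b), classify(x), round(float(np.linalg.norm(x - x_b)), 3))
# → 0 1 0.849
```

The classifier's weights never change; a small perturbation of B (here, L2 distance well under 1) is enough to flip its label to A's class.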

The article contains a link to the paper, in the first sentence. I'm basing my comments on the actual paper - not simply what I read in the article.
 
