News: Nightshade Data Poisoning Tool Could Help Artists Fight AI

setx

Distinguished
Again this nonsense of "imperceptible changes to creators’ images".
I'd bet it either makes images look noticeably synthetic, or the effect can be completely removed by a simple Gaussian/median filter (one that is transparent to humans and to learning, given sufficient input resolution).
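If the perturbation really is imperceptible to people, it's presumably small and high-frequency, which is exactly what re-filtering attacks. Something along these lines with Pillow is the kind of scrub I mean (the filter sizes are guesses, not tuned against Nightshade):

Python:
from PIL import Image, ImageFilter

def scrub(path_in, path_out):
    # Re-filter a suspect image before letting it anywhere near training data.
    img = Image.open(path_in).convert("RGB")
    # A median filter knocks out small, spatially local perturbations...
    img = img.filter(ImageFilter.MedianFilter(size=3))
    # ...and a light Gaussian blur smears whatever survives.
    img = img.filter(ImageFilter.GaussianBlur(radius=1))
    img.save(path_out)

scrub("suspect.png", "scrubbed.png")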
 

Co BIY

Splendid
The big aggregators (and state actors) already took all the data and have a safe copy stored for later use.

Steal now, decrypt later. I'd bet they did the same with all of that.

Archived images and text may be crucial for future generative AI, to avoid feeding the model not only poisoned data but also AI-generated data (AI-poisoned data?), which may magnify or concentrate bad effects, like inbreeding or biomagnification in a food chain.
 

bit_user

Titan
Ambassador
Again this nonsense of "imperceptible changes to creators’ images".
I'd bet it either makes images look noticeably synthetic, or the effect can be completely removed by a simple Gaussian/median filter (one that is transparent to humans and to learning, given sufficient input resolution).
You really ought to read the paper. As usual, the article has a link (which goes to the arxiv.org page, and that has a PDF link). I'll just say I was very impressed not only by how thorough the authors were in anticipating the applications of, and possible countermeasures against, the technique, but also by the range of scenarios and consequences they considered.

It's one of those papers you need to spend a bit of time with. They don't give away the goods on the technique until about section 5, so don't think you can just read the abstract and conclusion and glance at some results.

As impressed as I was with the research, I was even more disturbed by their findings, because the perturbation does seem truly imperceptible and very destructive to models, especially when multiple such attacks are mounted independently. Not only that, but it seems exceptionally difficult to defend against. Perhaps someone will find an effective countermeasure, but I came away with the impression that the analogous attack against spy satellites would be filling low-earth orbit with a massive amount of debris. That's because it's both indiscriminate and, like a good poison, you don't need very much of it.

What it's not is a protection mechanism like Glaze:

They do outline a scenario in which a company might use it to protect its copyrighted characters, but you couldn't use it to protect individual artworks.
 

bit_user

Titan
Ambassador
The big aggregators (and state actors) already took all the data and have a safe copy stored for later use.

Steal now, decrypt later. I'd bet they did the same with all of that.

Archived images and text may be crucial for future generative AI, to avoid feeding the model not only poisoned data but also AI-generated data (AI-poisoned data?), which may magnify or concentrate bad effects, like inbreeding or biomagnification in a food chain.
Eh, yes and no. They reference one open-source dataset, called LAION-Aesthetic, which:
"is a subset of LAION-5B, and contains 600 million text/image pairs and 22833 unique, valid English words across all text prompts"

So, you can access such datasets even if you're not one of the big guys. They also reference other datasets.
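For instance, assuming the LAION-Aesthetic metadata is mirrored on Hugging Face (the repo id and column names below are my guesses; check whichever mirror you actually use), you can stream it without downloading the whole thing:

Python:
# Sketch: stream a public LAION metadata set instead of downloading it all.
# The repo id is an assumption; substitute the mirror you actually use.
from datasets import load_dataset

ds = load_dataset("laion/laion2B-en-aesthetic", split="train", streaming=True)

for i, row in enumerate(ds):
    print(row.get("URL"), row.get("TEXT"))  # column names vary between mirrors
    if i >= 4:
        break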

The problem with not taking in any new data is that your models never advance beyond 2022, or whenever generative imagery started being published en masse. Your models would never learn how to generate images of people, places, or things newer than that. Still, I'm with you on using archived and sequestered data for older imagery, so as to minimize exposure to such exploits. For everything else, establishing chain-of-custody will be key.
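By chain-of-custody I just mean recording, at ingest time, a content hash plus where and when each image was acquired, so you can later show that a training image predates the poisoning tools. A trivial sketch (the manifest layout is something I made up for illustration):

Python:
# Sketch of a minimal provenance manifest: hash each file and record when
# and from where it was acquired. The JSON layout is made up for illustration.
import hashlib
import json
import time
from pathlib import Path

def manifest_entry(path, source_url):
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return {
        "file": str(path),
        "sha256": digest,
        "source": source_url,
        "acquired_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }

entries = [manifest_entry("cat_001.jpg", "https://example.com/cat_001.jpg")]
print(json.dumps(entries, indent=2))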
 