New corruption tool spells trouble for AI text-to-image tech

Key Takeaways:

– Generative AI tools, such as OpenAI’s ChatGPT, are trained on large amounts of data scraped from the web.
– Similar tools can generate images from text prompts, but the images used for training are often scraped from the web without consent from, or compensation to, the artists and photographers who created them.
– A team of researchers has developed a tool called Nightshade that can confuse the training model by adding invisible pixels to artwork before it’s uploaded to the web.
– Nightshade “poisons” the training data, potentially rendering the outputs of image-generating AI models useless.
– Nightshade could help artists regain control over their copyright and intellectual property.
– The team plans to release Nightshade as an open-source tool for others to refine and improve.
– OpenAI has faced lawsuits from artists for using their work without permission, and has recently allowed artists to remove their work from its training data, although the process is burdensome.
– Easier removal processes could discourage artists from turning to tools like Nightshade, which could otherwise create many more issues for AI firms in the long run.

Digital Trends:

Professional artists and photographers annoyed at generative-AI firms using their work to train their technology may soon have an effective way to respond that doesn’t involve going to the courts.

Generative AI burst onto the scene with the launch of OpenAI’s ChatGPT chatbot almost a year ago. The tool is extremely adept at conversing in a very natural, human-like way, but to gain that ability it had to be trained on masses of data scraped from the web.

Similar generative-AI tools are also capable of producing images from text prompts, but like ChatGPT, they’re trained by scraping images published on the web.

It means artists and photographers are having their work used — without consent or compensation — by tech firms to build out their generative-AI tools.

To fight this, a team of researchers has developed a tool called Nightshade that’s capable of confusing the training model, causing it to spit out erroneous images in response to prompts.

Outlined recently in an article by MIT Technology Review, Nightshade “poisons” the training data by adding invisible pixels to a piece of art before it’s uploaded to the web.

“Using it to ‘poison’ this training data could damage future iterations of image-generating AI models, such as DALL-E, Midjourney, and Stable Diffusion, by rendering some of their outputs useless — dogs become cats, cars become cows, and so forth,” MIT’s report said, adding that the research behind Nightshade has been submitted for peer review.
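To make the general idea more concrete, here is a minimal, purely illustrative Python sketch of the mechanics being described: applying a small, bounded pixel-level change to an image before it is published. This is not Nightshade’s actual algorithm, which optimizes the perturbation against a text-to-image model so the picture becomes associated with the wrong concept during training; the file names and the epsilon budget below are assumptions made for the example.

```python
# Illustrative sketch only -- not Nightshade's method. It shows how a small,
# visually imperceptible perturbation can be embedded in an image before upload.
import numpy as np
from PIL import Image

def apply_perturbation(image_path: str, output_path: str, epsilon: float = 4.0) -> None:
    """Add a small bounded perturbation (at most +/- epsilon per channel, 0-255 scale)."""
    img = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32)

    # Placeholder perturbation: uniform random noise bounded by epsilon.
    # A real poisoning tool would instead optimize this perturbation against a
    # target model so the image shifts the model's learned associations.
    perturbation = np.random.uniform(-epsilon, epsilon, size=img.shape)

    poisoned = np.clip(img + perturbation, 0.0, 255.0).astype(np.uint8)
    Image.fromarray(poisoned).save(output_path)

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    apply_perturbation("artwork.png", "artwork_protected.png")
```

The key point the sketch captures is that the change stays within a tiny per-pixel budget, so the published image looks unchanged to a human viewer while still differing from the original at the data level.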

While the image-generating tools are already impressive and are continuing to improve, the way they’re trained has proved controversial, with many of the tools’ creators currently facing lawsuits from artists claiming that their work has been used without permission or payment.

University of Chicago professor Ben Zhao, who led the research team behind Nightshade, said that such a tool could help shift the balance of power back to artists, firing a warning shot to tech firms that ignore copyright and intellectual property.

“The data sets for large AI models can consist of billions of images, so the more poisoned images can be scraped into the model, the more damage the technique will cause,” MIT Technology Review said in its report.

When it releases Nightshade, the team is planning to make it open source so that others can refine it and make it more effective.

Aware of its potential to disrupt, the team behind Nightshade said it should be used as “a last defense for content creators against web scrapers” that disrespect their rights.

In a bid to deal with the issue, DALL-E creator OpenAI recently began allowing artists to remove their work from its training data, but the process has been described as extremely onerous, as it requires the artist to submit a separate request for every single image they want removed, along with a copy of the image and a description of it.

Making the removal process considerably easier might go some way toward discouraging artists from opting for a tool like Nightshade, which could otherwise cause many more issues for OpenAI and others in the long run.


AI Eclipse TLDR:

Professional artists and photographers are increasingly frustrated with generative AI firms that use their work without permission or compensation to train their technology. To address this issue, a team of researchers has developed a tool called Nightshade, which can confuse the training model used by these firms. Nightshade “poisons” the training data by adding invisible pixels to an image before it’s uploaded to the web, causing the AI model to produce erroneous images in response to prompts. This tool could potentially damage the outputs of image-generating AI models and serve as a warning to tech firms that ignore copyright and intellectual property. The team behind Nightshade plans to release it as an open-source tool so that others can refine and improve it. OpenAI, the creator of DALL-E, a popular generative AI model, has recently allowed artists to remove their work from its training data, but the removal process has been described as burdensome. Simplifying the removal process could discourage artists from resorting to tools like Nightshade, which could otherwise lead to more problems for AI firms in the long run.