Artists may “poison” AI models before Copyright Office can issue guidance

Key Takeaways:

– President Joe Biden has signed an executive order addressing copyright concerns raised by AI in the United States.
– The US Copyright Office will publish the results of its study on AI concerns and issue recommendations on potential executive actions.
– Artists currently have limited options to protect their works from being used to train AI models.
– OpenAI allows artists to opt out of future training data but not existing data used by AI models.
– Cartoonist and illustrator Sarah Andersen is advancing a copyright infringement claim against Stability AI.
– Andersen must identify which specific copyrighted images were used to train AI models and show that those models reproduced art that looks exactly like hers.
– The court will determine if using data to train AI models is fair use of artists’ works.

Ars Technica:

An image OpenAI created using DALL-E 3.

Artists have spent the past year fighting companies that have been training AI image generators—including popular tools like the impressively photorealistic Midjourney or the ultra-sophisticated DALL-E 3—on their original works without consent or compensation. Now, the United States has promised to finally get serious about addressing the copyright concerns raised by AI, President Joe Biden said in his much-anticipated executive order on AI, which was signed this week.

The US Copyright Office had already been seeking public input on AI concerns over the past few months through a comment period ending on November 15. Biden’s executive order has clarified that following this comment period, the Copyright Office will publish the results of its study. And then, within 180 days of that publication—or within 270 days of Biden’s order, “whichever comes later”—the Copyright Office’s director will consult with Biden to “issue recommendations to the President on potential executive actions relating to copyright and AI.”

“The recommendations shall address any copyright and related issues discussed in the United States Copyright Office’s study, including the scope of protection for works produced using AI and the treatment of copyrighted works in AI training,” Biden’s order said.

That means that potentially within the next six to nine months (or longer), artists may have answers to some of their biggest legal questions, including a clearer understanding of how to protect their works from being used to train AI models.

Currently, artists do not have many options to stop AI image makers—which generate images based on user text prompts—from referencing their works. Even companies like OpenAI, which recently started allowing artists to opt out of having works included in AI training data, only allow artists to opt out of future training data. Artists can’t opt out of training data that fuels existing tools because, as OpenAI says:

After AI models have learned from their training data, they no longer have access to the data. The models only retain the concepts that they learned. When someone makes a request of a model, the model generates output based on its understanding of the concepts included in the request. It does not search for or copy content from an existing database.

According to The Atlantic, this opt-out process—which requires artists to submit requests for each artwork and could be too cumbersome for many artists to complete—leaves artists stuck with only the option of protecting new works that “they create from here on out.” It seems like it’s too late to protect any work “already claimed by the machines” in 2023, The Atlantic warned. And this issue clearly affects a lot of people. A spokesperson told The Atlantic that Stability AI alone has fielded “over 160 million opt-out requests in upcoming training.”

Until federal regulators figure out what rights artists ought to retain as AI technologies rapidly advance, at least one artist—cartoonist and illustrator Sarah Andersen—is advancing a direct copyright infringement claim against Stability AI, maker of Stable Diffusion, another remarkable AI image synthesis tool.

Andersen, whose proposed class action could impact all artists, has about a month to amend her complaint to “plausibly plead that defendants’ AI products allow users to create new works by expressly referencing Andersen’s works by name,” if she wants “the inferences” in her complaint “about how and how much of Andersen’s protected content remains in Stable Diffusion or is used by the AI end-products” to “be stronger,” a judge recommended.

In other words, under current copyright laws, Andersen will likely struggle to win her legal battle if she fails to show the court which specific copyrighted images were used to train AI models and demonstrate that those models used those specific images to spit out art that looks exactly like hers. Citing specific examples will matter, one legal expert told TechCrunch, because arguing that AI tools mimic styles likely won’t work—since “style has proven nearly impossible to shield with copyright.”

Andersen’s lawyers told Ars that her case is “complex,” but they remain confident that she can win, possibly because, as other experts told The Atlantic, she might be able to show that “generative-AI programs can retain a startling amount of information about an image in their training data—sometimes enough to reproduce it almost perfectly.” But she could fail if the court decides that using data to train AI models is fair use of artists’ works, a legal question that remains unclear.


AI Eclipse TLDR:

The United States has acknowledged the concerns raised by artists over the unauthorized use of their original works to train AI image generators. President Joe Biden recently signed an executive order addressing the copyright concerns raised by AI. Under the order, the US Copyright Office will publish the results of its study on AI concerns, and within roughly six to nine months (or longer), the Copyright Office's director will consult with Biden on recommendations for potential executive actions relating to copyright and AI. This offers hope for artists who have been seeking clarity on how to protect their works from being used to train AI models.

Currently, artists have limited options to prevent AI image makers from referencing their works: they can only opt out of future training data, the opt-out process is cumbersome, and works already used to train existing AI tools cannot be withdrawn. Artists are left with the option of protecting only the new works they create going forward. The issue is widespread, with millions of opt-out requests already submitted.

One artist, cartoonist and illustrator Sarah Andersen, is advancing a copyright infringement claim against Stability AI, maker of the AI image synthesis tool Stable Diffusion. Andersen has about a month to amend her complaint to show how her copyrighted images were used to train AI models and that those models produced art closely resembling her own. The outcome of her proposed class action could have implications for all artists. Until federal regulators establish clear guidelines on the rights artists should retain, legal battles in this area remain uncertain.