Reddit Will License Its Data to Train LLMs, So We Made a Firefox Extension That Lets You Replace Your Comments

Blaze@lemmy.blahaj.zone · 7 months ago

FaceDeer@fedia.io · 6 months ago

If the AI trainers have the original text then “poisoning” the live site’s content isn’t going to do anything at all.

You can’t touch the original text. It’s already been archived.

tehciolo@lemm.ee · 6 months ago

If they scrape the updated comments again and ingest copyrighted text, you are poisoning the data.

FaceDeer@fedia.io · 6 months ago

That’s my point. They won’t.

And even if they did, it’s unclear that copyright has anything to say about AI training anyway.

InternetPerson@lemmings.world · 6 months ago

The NYT is currently suing over copyright infringement.

https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html

“it’s unclear that copyright has anything to say about AI training anyway”

Although lawmakers worldwide slept while AI advanced and so failed to pass some important laws, they are catching up. Europe recently passed its first AI Act. As far as I’ve seen, it also states that companies must disclose a detailed summary of their training data.

https://www.ml6.eu/blogpost/ai-models-compliance-eu-ai-act

The Luddite