Went ahead and started running redacted on my old account.
Nothing says we’re just another brick in the wall like writing posts that wind up being used to train a plagiaristic corporate unemployment machine.
What prevents people from training a model with Lemmy’s data?
The next move is to use AI to generate posts and comments
I honestly think that has been happening with all these publications websites.
spez says that’s how he got reddit off the ground in the first place: faking content/engagement (well, genuinely engaging with his account(s?), but essentially shouting into the void and hoping enough people heard and wanted to stick around).
with a RedditUserBot trained on reddit users, you might be able to fake another decade of growth.
Shit move from Reddit. Glad I jumped ship to lemmy.
Honestly, lemmy has fewer users compared to Reddit, yet you still get more engagement.
The only engagement you actually get is on super-niche subreddits. Other than that, the “engagement” is largely indistinguishable from bot traffic.
I come to Lemmy to read threads of people arguing about whether or not they’re talking to each other at all. This is doing it for me.
You just engaged.
Or admitted to being a bot.
This isn’t Reddit though.
I feel like that comment was edited to be less ambiguous.
I added “on reddit” when I saw people were misunderstanding me.
💍 Will you marry me?
Can you pass a CAPTCHA?
Are you implying I can’t pick out bridges or motorcycles? I definitely can, but I won’t do it for you as some kind of sick parlor trick.
Speaking of tricks, did you know there are singles in your area!
Sexy singles- In my area? Are there any weird tricks they don’t want me to know? Just one would probably work.
They can make entire hot dogs disappear! Crazy, right?
One bite at a time, you sickos. Omg, you pervs.
…didn’t have to be the mouth. I’m still impressed.
Your stipud ! (both sic and /s btw) -> there, now you don’t have to go back to Reddit to recall the nostalgia, you are … welcome, I guess?:-D
Ahhh, that’s the stuff. 🤤 Do it again.
Your (sic) WRONG!
About EVRRTYHIGN! (sic)
I may know nothing myself, but I still have an opinion and will share it with you, consent be damned!
Why I… [Reddit cap exceeded, please deposit $10 to continue conversation].
👆this
(Did that do it?)
If gollum and Steve Buscemi had a secret baby
You are glad that you jumped to where AI companies can get the information for free, but are mad at Reddit for getting paid for it.
I can’t make any sense of this.
It’s like the difference between volunteering and being forced to do community service.
In neither case are you forced to do anything so this doesn’t make any sense either.
The difference is that Lemmy admins across the fediverse aren’t making the user experience worse so they can sell the data to corporations for LLM training
So it’s really that the user experience is getting worse. Feeding ai has nothing to do with it.
I’d rather have AI companies have my data for free than reddshit getting paid for it
First of all, tacos are friends, not food…
Secondly, I think it’s more important what they did to achieve this goal: locking down the API behind a paywall was their way of creating value in their data. They knew then that it would be too expensive for independent developers to pay for but didn’t care. They knew the money would be coming from AI data brokers.
Glad I deleted all of my content over there, then.
This may shock you, but it’s not deleted.
Yeah. There was this guy who deleted his account but Reddit restored it. Apparently he was going to take them to court based on some GDPR article.
They still have all the edit history. All editing does is show the last one. The servers would have every version.
this is explicitly illegal under GDPR
That’s not going to stop them.
but you can sue their ass over it
I attempted to delete all my posts using one of those nuke-Reddit scripts and my account got banned for it.
And that’s why I edited+deleted all of mine.
Our collective toilet thoughts are going to fuel the future of robot rhetoric guys
We should have been posting factually incorrect information instead of deleting posts this whole time.
Although I think Reddit does a good job paying factually incorrect information on its own.
FUCK REDDIT! FUCK U/SPEZ! The Red-exit shall endure, VIVA LA LEMMY!!
And FUCK XITTER. Bluesky and Mastodon are waving!
Just because the coffee is free doesn’t mean you have to drink the entire carafe
Yes it does. I’ll get bullet-time superpowers eventually, just watch…
I remember that episode of Futurama
que heart attack
You get nerve damage and seizures before a heart attack unless you have a pre-existing condition.
^Hush, with the facts. The small print requires you to suspend some belief for the jokes to work. Don’t blow it!^
¿Qué?
Queue*
🤗
Cue
That’s the bugger
Hypnotoad getting ready for work
Greedy little pigboy Steve couldn’t resist. Every day they seem to do something that reaffirms leaving was the best plan.
I am willing to bet the most active subreddits that are not too bot infested are the NSFW ones. Reddit AI is going to be creepy and horny.
AI trainers do a lot of work filtering and reformatting the training data. Often that’s the most expensive part. There’s a lot of synthetic data used these days too, reprocessed by other AIs.
i stopped using reddit and deleted my account and posts when they introduced those fucking nft-avatars and it seems that they’ve been going downhill ever since.
Those NFT things were just a bad move.
they were headed downhill loooong before NFTs became a thing.
When you delete your account and posts now, unless you edit them first, all deleting them does is hide their visibility in the database. The post is still there.
I don’t mind to give my content for AI training. But with my approval and for free.
You can’t put conditions on it retroactively. You already published.
I am not trying to do that retroactively.
“But with my approval and for free” are new conditions that weren’t present when you originally published it on Reddit.
Yes, but I did not mean retroactively. Nor did I mean only on Reddit, by the way. However, making money from already published content is not what I consented to when I joined Reddit like 15 years ago.
From the current Reddit User Agreement:
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.
I found a historical version from 10 years ago and that version already had this:
you agree that by posting messages, uploading files, inputting data, or engaging in any other form of communication with or through the Website, you grant us a royalty-free, perpetual, non-exclusive, unrestricted, worldwide license to use, reproduce, modify, adapt, translate, enhance, transmit, distribute, publicly perform, display, or sublicense any such communication in any medium (now in existence or hereinafter developed) and for any purpose, including commercial purposes, and to authorize others to do so.
Haven’t dug up anything earlier than this, do you know of any?
Basically, you gave Reddit your approval long ago.
Yep, they changed it.
Did you use the service in the last 10 years?
I’ve just deleted my Reddit account. That’s the last straw for me.
Before they shut down the APIs, I deleted all my posts and edited all my comments.
Spez doesn’t get to profit from me anymore. And hopefully I’m poisoning the well.
Deleting doesn’t actually delete it all. I remember a Reddit user once filed a GDPR complaint because his information was restored after he deleted it.
I don’t miss the dipshits, pun spammers, and smug power mods of reddit at all. I do miss their niche subs and smarter users. Like it or not, they do have some brainy folks peppered among the shit posters.
We have some good folks here, too. Just need more of them.
It’s a shame reddit has been dialing up the shit faucet slowly enough that most of their users don’t notice how awful it is now. They’ve grown accustomed to the poor quality of the content and weaponized greed of the owners.
smug power mods of reddit at all.
Oh they’re here too. They’re not causing too much drama because there’s not enough going on, but they’re here. Some of them are admins of certain instances.
The ones that aren’t here yet will eventually find their way here when Lemmy continues to grow. And the most concerning thing about that is how many more tools Lemmy is providing them to fuck with users.
At least on Reddit, mods couldn’t see votes. Lemmy actually just made it easier for them.
Yeah that’s not good.
Going back to /r/all on reddit now is just pure trash. It’s unbelievable how badly it’s declined, very recently.
I wonder how much of it is just bots and karma farmers pretending to talk to each other. It’s really awful.
In all honesty, when I joined Reddit right after digg went to shit, it was amazing. Reddit was great, 3rd party apps were welcome, their interface was straightforward, and they had none of that NFT gold shit.
It just went downhill.
I joined maybe 6 years ago, and there was a bit of shit talking and most posts had a troll answer hitting the most votes for some reason, but it was usually pretty good to scroll straight past and find some really insightful comments. There was a lot of good stuff around reddit, but slowly the absurd number of awards, NFT avatars, reposts, and ads every third post started to corrupt it. It was simple enough to switch to a third party app for quite a while, but the garbage slowly took over.
Even if they hadn’t pulled 3rd party apps, it was getting pretty close to a point where it wasn’t worth scrolling past the bullshit.
At that point, they were also open source which was super cool. I always wanted that profile badge you got for submitting a merged PR.
Reddit really went downhill fast after ~2015. I think Lemmy will get there eventually. I remember reddit being a lot smaller back then as well. It took a while to get to the point where niche communities could thrive and I do believe we’ll see that happen here as well (even if it takes a decade or so)
I left Reddit. Had over 600k Karma after a few years answering all kinds of questions from Veteran help to complex engineering.
Fuck Reddit. Will never go back. It’s a shell of what it was only a few years ago.
Glad you’re here with us!
I assume AI is training off the content here for free.
It’s all federated, so it would be strange if the bots didn’t scrape anything.
I was curious if a robots.txt equivalent exists for AI training data, and there were some solid points here:
If I go to your writing, I read it & learn from it. Your writing influences my future writing. We’ve been okay with this as long as it’s not a blatant forgery.
If a computer goes to your writing, it reads it & learns from it. Your writing influences its future writing. It seems we are not okay with this, even if it isn’t blatant forgery.
[AI at the moment is] different because the company is re-using your material to create a product they are going to sell. I’m not sure if I believe that is so different than a human employee doing the same thing.
https://news.ycombinator.com/item?id=34324208
I still think we should have the ability to opt out like we do with search engines and webcrawlers, but if the algorithm works ideally and learns but does not recycle content, is it truly any different from a factory of workers pumping out clones of popular series on Amazon? I honestly don’t know the answer to that.
This is kinda my take on it. However, the way I see it is that the AI isn’t intelligent enough yet to truly create something original. As such, right now AI is closer to being a tool than a being. Because of that, it somewhat bothers me that I’m being used to teach a tool. If I thought that companies like OpenAI were truly trying to create beings and not tools, then I’d feel differently.
It’s kinda nuanced, but a being can voluntarily determine whether or not something is copyright infringing, understand why that might be an issue, and then decide whether or not to continue writing based on that. A tool can’t really do that. You can try and add filters to a tool to avoid writing copyrighted text, but that will have flaws and holes in it. A being who understands what it’s writing and what makes it plagiarism vs reference vs homage/inspiration/whatever is less likely to have those issues.
Afaik the OpenAI bot may choose to ignore it? At least that’s what another user claimed it does.
Robots.txt has always been ignored by some bots; it’s just a guideline originally meant to prevent excessive bandwidth usage by search indexing bots, and it’s entirely voluntary.
Archive.org bot for example has completely ignored it since 2017.
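To make the "entirely voluntary" part concrete, here is a minimal sketch of what a well-behaved crawler does before fetching a page, using Python's standard urllib.robotparser. The robots.txt contents, the bot names ExampleAIBot and SomeSearchBot, and the example.com URLs are all made up for illustration. Nothing here enforces anything: a bot that simply never runs this check can fetch whatever it likes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler everywhere,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a polite crawler runs on itself. The site has no
    way to force a crawler to run it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    urls = ("https://example.com/post/1", "https://example.com/private/x")
    for agent in ("ExampleAIBot", "SomeSearchBot"):
        for url in urls:
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

So an opt-out for AI training built the same way would only be as good as the crawler's willingness to check it.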
The problem is not the technology, the problem is the businesses and the people behind them.
These tools were made with the explicit purpose of taking content that they did not create, repurposing it, and creating a product. Throw all this conversation about intelligence and learning out the fucking window, what matters is what the thing does, and why it was created to do that thing.
Until we reach a point where there is some sort of AI out there that has any semblance of free will, and can choose not to learn if fed certain information, and choose not to respond to input given to it without being programmed not to respond, then we are not talking about intelligence, we are talking about a tool. No matter how they dress it up.
Stop arguing about this on their terms, because they’re gaslighting the fuck out of you.
Yes, but there’s no contract to give them legal cover if anyone ever does anything about all the content they steal.
And ya know what? Frankly, if AI is going to harvest all this shit, I’d rather fuckers like spez couldn’t get rich off it in the process. Granted I’m not happy the tech bros running these AI companies are getting rich with these fucking things, but I can at least take solace there isn’t some asshole middle man making bank off the work and words of users they never paid a dime to.
Genuinely, why do spez and Reddit deserve to make money off anything we posted? Why does any social media site? They make the site, pay for the servers, maintain the apps, sure, and they can get compensation for that, I don’t see a problem there. But why does any social media company deserve to get rich when the only thing that makes their platform valuable is the people that post to it? Reddit didn’t even have paid mods, the community did all the work on the content of that site, why in the fuck do we tolerate these assholes making profit off it like this?
Intellectual property theft
This is sad to read because I agree with all of it (except the casual sexism).
why in the fuck do we tolerate these assholes making profit off it like this?
Look at this thread. People delete their posts on Reddit. Which means that they can no longer be scraped for free. Which means they are now exclusively available in Reddit’s archive. It’s not that people tolerate it. It’s that the first instinct of people who don’t tolerate it, is to make it worse. What can you do?
What do you mean? What legal cover do they need against what actions?
If the EU (or any other governments) decide that AI companies can’t legally train their models on information they don’t own or license (I don’t know how that would work legally but they talk about it), then this company that Reddit has sold access to could argue to lawmakers that it has a license for all the content on Reddit. I don’t know that it would hold up, but I suspect it’s part of the company’s perceived value in this Reddit deal.