Reddit CEO Steve Huffman is standing by Reddit’s decision to block companies from scraping the site without an AI agreement.

Last week, 404 Media noticed that search engines other than Google were no longer listing recent Reddit posts in their results. That’s because Reddit updated its Robots Exclusion Protocol file (robots.txt) to block bots from scraping the site. The file reads: “Reddit believes in an open Internet, but not the misuse of public content.” Since the news broke, OpenAI announced SearchGPT, which can show recent Reddit results.
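
The Robots Exclusion Protocol is only a request: a compliant crawler fetches a site’s robots.txt and checks whether it may visit a URL before scraping it; nothing technically stops a crawler that ignores the file. A minimal sketch of that check in Python using the standard library (the user-agent strings and URL below are illustrative, not the exact tokens in Reddit’s file):

    import urllib.robotparser

    # Fetch Reddit's robots.txt and ask whether a few example crawler
    # user agents may fetch a given page. Compliance is voluntary; this
    # is the check that well-behaved bots are expected to perform.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://www.reddit.com/robots.txt")
    rp.read()

    for agent in ("GPTBot", "Bingbot", "*"):
        allowed = rp.can_fetch(agent, "https://www.reddit.com/r/technology/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")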

The change came a year after Reddit began its efforts to stop free scraping, which Huffman initially framed as an attempt to stop AI companies from making money off of Reddit content for free. This endeavor also led Reddit to begin charging for API access (the high pricing led many third-party Reddit apps to shut down).

In an interview with The Verge today, Huffman stood by the changes that left Google temporarily the only search engine able to show recent discussions from Reddit. Reddit and Google signed an AI training deal in February reportedly worth $60 million a year. It’s unclear how much Reddit’s OpenAI deal is worth.

Huffman said:

Without these agreements, we don’t have any say or knowledge of how our data is displayed and what it’s used for, which has put us in a position now of blocking folks who haven’t been willing to come to terms with how we’d like our data to be used or not used.

“[It’s been] a real pain in the ass to block these companies,” Huffman told The Verge.

  • dual_sport_dork 🐧🗡️@lemmy.world · 3 months ago

    Everyone always says this like it’s some kind of gotcha, but all of my nuked posts still have my “fuck you, reddit” content and haven’t been reverted. It’s been almost exactly a year.

    Maybe Reddit has an offline copy of my old content and that of others somewhere, but if so, they’d be handing that directly over to whoever under some kind of agreement; that certainly wouldn’t be the subject of any kind of site crawling, which is the crux of the issue here.

    • stevedidwhat_infosec@infosec.pub · 3 months ago

      You’re ignoring the idea that they could still be working on a way to restore content and haven’t completed that process yet

      Or that they could start feeding your archived (not cached) data directly to the AI companies anyway for a price

      IMO, you can win by jamming your “transmissions” with noise. It’s easier to hide in noise, as noise, than it is to be silent. Muddy the waters, as it were.

      • finley@lemm.ee · 3 months ago

        You’re ignoring the idea that they could still be working on a way to restore content and haven’t completed that process yet

        There’s no evidence to suggest this, though.

        • stevedidwhat_infosec@infosec.pub · 3 months ago

          Content is absolutely archived and they have financial incentive to restore the quality of their “knowledge base”

          That’s a fair amount of circumstance and motivation to support my idea, regardless of tangible evidence

          • finley@lemm.ee · 3 months ago

            Motivation and circumstance, absent actual evidence, does not make for a convincing argument.

              • finley@lemm.ee · 3 months ago

                No, it just means that they are no more than ideas at this point

                • stevedidwhat_infosec@infosec.pub · 3 months ago (edited)

                  Right, which means it can be fairly considered when discussing the real crux of the issue with AI and big tech companies right now, which is the monetization of other people’s content.

                  If we’re discussing this, we should be looking at whether or not companies are doing this, given they have motive and specific, relevant circumstance to enact such behavior.

                  Lack of evidence means you need to go looking for said evidence; it does not mean you should not investigate. Privacy advocates, and members of any org or cert with an ethics statement, should be blowing the whistle on any kind of activity that would mean a user’s data is not being deleted upon their request, especially considering Reddit’s global usage.

                    • finley@lemm.ee · 3 months ago (edited)

                      By all means investigate; I’m just saying that no actual evidence has yet been presented. I look forward to seeing whatever you may discover.

    • Womble@lemmy.world · 3 months ago

      It never was deleted. All that happened is that an extra row was added to a database saying “comment 65432426542654 should now be displayed as ‘fuck you, reddit’ rather than the original text.” The original post is still there in an earlier row, available to Reddit; it just isn’t being displayed on their web page.
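
      A minimal sketch of that kind of soft edit, using an in-memory SQLite database (the table and column names are made up for illustration; Reddit’s actual schema isn’t public):

          import sqlite3

          conn = sqlite3.connect(":memory:")
          conn.execute("""
              CREATE TABLE comment_revisions (
                  id         INTEGER PRIMARY KEY,
                  comment_id INTEGER,
                  body       TEXT,
                  created_at INTEGER
              )
          """)

          # The original comment body is stored as the first revision.
          conn.execute(
              "INSERT INTO comment_revisions (comment_id, body, created_at) VALUES (?, ?, ?)",
              (65432426542654, "original, useful answer", 1),
          )

          # The user's "delete everything" edit just appends a newer revision.
          conn.execute(
              "INSERT INTO comment_revisions (comment_id, body, created_at) VALUES (?, ?, ?)",
              (65432426542654, "fuck you, reddit", 2),
          )

          # The web page renders only the latest revision...
          latest = conn.execute(
              "SELECT body FROM comment_revisions "
              "WHERE comment_id = ? ORDER BY created_at DESC LIMIT 1",
              (65432426542654,),
          ).fetchone()
          print("displayed:", latest[0])  # fuck you, reddit

          # ...but every earlier revision is still sitting in the table.
          history = conn.execute(
              "SELECT body FROM comment_revisions WHERE comment_id = ? ORDER BY created_at",
              (65432426542654,),
          ).fetchall()
          print("stored:", [row[0] for row in history])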