• pHr34kY@lemmy.world
    6 months ago

    do not choose something copyrighted.

    Is that with a “nudge, nudge, wink, wink”? It would be such a shame if the whole project were jeopardised by such things.

  • Lvxferre@mander.xyz
    6 months ago

    A few points that I’d like to highlight about this tool and its usage. Note: on a prescriptive level I’m focusing on moral matters, not legal ones.

    This tool allows you to edit your content. You might have allowed other people and Reddit Inc. to use it, but it’s still yours. And you should be free to do whatever you want with your content, even if it inconveniences others. And people expecting you to give up your moral rights for the sake of their own benefit, frankly, are simply entitled.

    Another user here compared this with vandalism; I don’t think that the comparison is good, given that vandalism targets someone else’s property.

    I also think that people in general are focusing too much on the short-term consequences of the usage of this tool, and too little on the long-term ones. Here comes some bullet-point hell:

    • SEO “improvements” already caught up with the “add «reddit» to search queries!” trick. It’s becoming less effective over time.
    • Reddit is accumulating huge amounts of noise, due to increased bot activity and decreased moderation. It’ll likely get worse over time.
    • Reddit is walling itself off more and more over time. Eventually this info will become unavailable to anyone who didn’t sell their soul to Greedy Pigboy and isn’t feeding that cesspool.
    • Every piece of content that you leave on that site is yet another piece of content “inviting” other users to register and stay there, dumping their content into that increasingly walled garden, where it won’t be available publicly. And while they’re free to do so if they so desire (it’s their content), you’re also free to not invite them.
    • There are alternatives to that enshittified platform, competing directly with it. (We’re in one, by the way.) We should encourage people to use those alternatives, not Reddit.

    Are you all getting the picture? You might be tempted to leave your content on Reddit for the sake of other people; even then, the pros of doing so are rather small, and there are cons not often mentioned.

    Regarding LLMs, frankly? I think that it’s mostly a neutral point. Sure, data hoarding bots will get your content from Reddit… but they’ll also do it if you post here in the Fediverse, in your blog, or elsewhere. The only way to avoid feeding those bots is to not speak “in the open”.

    • nucleative@lemmy.world
      6 months ago

      Has anyone recently checked the Reddit ToS?

      It’s possible that by clicking that submit button, a perpetual worldwide license was granted that included any purpose Reddit deemed worthy.

      That could actually include every single version of every comment. Your first post, your ninja edit to correct your spelling, your edit update, and finally your plugin’s update that wipes out your comment. All of this could be data Reddit can provide to LLM researchers.

    • fine_sandy_bottom@discuss.tchncs.de
      6 months ago

      I think the most important point is that it’s completely ineffective for thwarting LLMs. They will be trained using the original data.

      Also, if any significant portion of users nuked their comment history it would be trivial for reddit to block the user and undo the edits.

      • Lvxferre@mander.xyz
        6 months ago

        Also, if any significant portion of users nuked their comment history it would be trivial for reddit to block the user and undo the edits.

        It would be trivial from a procedural standpoint, but not from a social one. It would give Reddit a really bad reputation: “this site doesn’t allow you to remove your content from it”. That is especially problematic in Europe.

  • FaceDeer@fedia.io
    6 months ago

    Pointless vandalism. The original comments are already archived, this will accomplish nothing except make Google results even worse for people.

    • starman2112@sh.itjust.works
      6 months ago

      Exactly my thoughts, and it’s why I haven’t stopped using the site. This doesn’t hurt reddit at all, it only hurts people who want answers to obscure questions. What sucks is that the kind of person who knows what bug causes someone’s Dell Inspiron D630 to make a beeping noise every 23 seconds is exactly the kind of person who’s going to have all of their comments replaced with AI poison.

      • Wiz@midwest.social
        6 months ago

        it only hurts people who want answers to obscure questions

        Which makes those people less likely to trust Reddit with answers…

        So they go to Reddit less often…

        Which hurts Reddit.

    • southsamurai@sh.itjust.works
      6 months ago

      I think you might want to reconsider using vandalism as your analogy.

      You can’t vandalize your own property, and any comment or post you make is your words.

      It would be like telling a writer they can’t edit their own work. That’s not vandalism, it’s removing the limited license granted to reddit for your copyrighted material.

      • FaceDeer@fedia.io
        6 months ago

        You absolutely can vandalize your own work. People are destroying posts they made that other people find useful, purely out of spite.

        Reddit may allow it, it may be totally legal, but that doesn’t make a difference to my opinion that it’s petty vandalism.

        Go ahead and downvote away, that doesn’t change anything either.

        • southsamurai@sh.itjust.works
          6 months ago

          Dude. If it’s yours, anything you do to it is editing. Vandalism defaces something that belongs to someone else.

          If I spray paint my tag on my own house, it’s likely ugly as hell, but it ain’t vandalism. If you tag my house, that’s vandalism.

          You can have the opinion you want, there’s a shit ton of bad opinions in the world, and we all have at least one ;)

    • Pavidus@lemmy.world
      6 months ago

      I’m down for that as well. It’s their info, and they can do with it as they please. I have no right to it, unless they allow it. I totally understand the frustration of not finding the info you want, but I still support the practice.

      It sucks that’s where we are, but WE didn’t steer the ship here. Now we just need to play ball within the confines given to us.

      • FaceDeer@fedia.io
        6 months ago

        It’s their info, and they can do with it as they please.

        And one of the things they did with that info was to license it to Reddit, who is now authorized to do what they please with it. No backsies.

          • FaceDeer@fedia.io
            6 months ago

            Reddit has the data. AI trainers have the data. Ordinary people Googling for help with their obscure problems get the junk it was overwritten with.

            • sudneo@lemm.ee
              6 months ago

              Which is bad in the short term, but good in the long term, as that means less traffic to Reddit. Ultimately, Reddit will face the consequences of the actions it has taken.

  • EdibleFriend@lemmy.world
    6 months ago

    Lots of stuff like this already exists and has been proven useless. A guy here on lemmy was a big answer type on some tech support sub. He used one of the account scrubbers to nuke his comments before he deleted his account. He went to look again a few weeks later and all his top comment answers had been restored.

    They haven’t bothered with most people because their comments simply aren’t useful for making the place look attractive, but no matter what you do, your comments are stored and will be sold off to the AI companies.

    • WhatAmLemmy@lemmy.world
      6 months ago

      I believe you can only edit the last 1000 or so comments from your profile. Anything older than that doesn’t display.

    • Th4tGuyII@kbin.social
      6 months ago

      Yeah - this is what I was thinking. We all heard about people being unable to delete comments, or Reddit keeping comments even after account deletions, back during the first migration. So what stops them from holding onto comment history, and what stops them from using that to teach LLMs to discern poisoned data from real data, as @pixxelkick said?

    • pixxelkick@lemmy.world
      6 months ago

      Yeah, in fact you’re giving the LLM additional data to train on what poisoned data looks like so it can avoid it better, as they can clearly see the before vs. after.

      • InternetPerson@lemmings.world
        6 months ago

        It is necessary to employ a method which enables the training procedure to distinguish copyrighted material. In the “dumbest” case, some humans will have to label it.

        Just because you’ve edited a comment doesn’t mean that this can be seen as “oh, this is under copyright now”.

        I’m not saying it’s technically impossible. On the contrary, it very much is possible. It’s just more work. This drives the development costs up and can give some form of satisfaction to angered ex-reddit users like me. However, those costs will be peanuts for giants like Google / Alphabet.

    • tehciolo@lemm.ee
      6 months ago

      I think you missed the part where it was strongly suggested that you “not” use copyrighted text.

      The point is not to get rid of the original text. It’s to “poison” the training data.

  • brygphilomena@lemmy.world
    6 months ago

    This only affects scrapers. If reddit is selling the data, they will just sell the unedited version from their database.

    This is ineffective, and deleting or editing reddit comments has always been a circle jerk to make yourself feel good that you are “hurting” reddit in some way.

      • Railing5132@lemmy.world
        6 months ago

        There was a time when there were many sites on the internet; hundreds, thousands even. And someone could search for content in topics they were interested in and find discussions in forums. I hope the internet becomes that again and sites like reddit burn to the ground, their servers salted to never grow again.

        The world recovered from the burning of Alexandria, and it would recover from the death of reddit. And from the rumbling of their new ad injection schemes, the sooner the better.

    • AliasAKA@lemmy.world
      6 months ago

      While this is true, I also kind of doubt that Reddit isn’t just one mistake away from accidentally deleting an old db and losing the historical data. So it may in fact mess up their ability to sell the data.

      There are also potential GDPR violations, etc., if you’re in the EU.

      • brygphilomena@lemmy.world
        6 months ago

        If they were that close, they wouldn’t run a site which solely relies on the safeguarding of that data. I cannot imagine they don’t know how to handle and back up data.

        As for the GDPR, the data sold to an AI company for LLMs is probably anonymized. Or they have a database that contains only the posts and no account information. From a cursory read of the GDPR, your personal data is your account, not necessarily your posts. If the posts are no longer associated with an account, they are fair game for reddit.

        Ironically, deleting the accounts might make it easier for reddit to use the data.

  • Makeshift@sh.itjust.works
    6 months ago

    Making info on Reddit useless to real humans is the main reason I need to set aside time to do this.

    I really don’t care if AI trains off of what I’ve said. I do care that greedy greedy Steve Huffman killed 3rd party apps for it.

    If Reddit’s use for searching obscure stuff goes away, there goes the biggest draw of the site. Get people going elsewhere. Like here!

  • InternetPerson@lemmings.world
    6 months ago

    I think I have about 4000 comments on reddit. I stopped using reddit in the summer of last year when they pushed their fucking API changes; I have been on Lemmy since and never looked back. However, I still have the account, because sometimes I had really nice conversations, which I would like to look up once in a while, or to pick up something which I wanted to keep for another time, like a bookmark basically. I’m also one of the people who sometimes write a lot; walls of text as a product of a lot of effort I put in. It would be sad to see it all go away. Then again, fuck reddirt and its management.

    Is there a tool to back up my comments (or also the corresponding threads)? After that I’ll gladly use the tool provided by luddite.