• 7fb2adfb45bafcc01c80@lemmy.world
    25 days ago

    Again, isn’t that the site’s prerogative?

    I think there should at least be a recognized way to opt out that archive.org actually follows. For years they told people to put

    User-agent: ia_archiver
    Disallow: /
    

    in robots.txt, but they still archived content from those sites. They refuse to publish the IP addresses they pull content from, even though that would be trivial to do. They also refuse to use a User-Agent that you can filter on.
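
    For anyone who wants to check what that rule is supposed to mean, here is a rough Python sketch using the standard library's urllib.robotparser. The helper name is_allowed and the example.com URL are made up for illustration, and this only shows how a well-behaved crawler would read the file; it says nothing about how archive.org's crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# The opt-out rule: block the ia_archiver user agent from the whole site.
BLOCK_ALL = """\
User-agent: ia_archiver
Disallow: /
"""

# Same rule with an empty Disallow value, which blocks nothing.
BLOCK_NOTHING = """\
User-agent: ia_archiver
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the text directly
    instead of downloading a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some/page"  # placeholder URL
    print(is_allowed(BLOCK_ALL, "ia_archiver", page))
    print(is_allowed(BLOCK_ALL, "SomeOtherBot", page))
    print(is_allowed(BLOCK_NOTHING, "ia_archiver", page))
```

    Note the slash matters: an empty Disallow value means "nothing is disallowed", so only the Disallow: / form is an actual opt-out. And of course none of this helps if the crawler ignores the file or doesn't identify itself with that user agent.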

    If you want to be a library, be open and honest about it. There’s no need to sneak around.