Also the more the internet is swept with AI generated content, the more future datasets will be trained on old AI output rather than on new human input.
Lets see, since the goal is to prevent webscaping all these should work: paywalls, account only acsess, text obferscation (e.g. using a custom font that maps letters randomly to other ones so it looks fine but to a webscraper it looks like gibberish), HTML obferscation (inserting random characters in the HTML then hiding them using CSS) and many more.
Also the more the internet is swept with AI generated content, the more future datasets will be trained on old AI output rather than on new human input.
Humans are also now incentivized to safeguard their intellectual property from AI to keep a competitive advantage.
What are some strategies for doing that? (This is me, totally not a bot)
Paywalls.
Lets see, since the goal is to prevent webscaping all these should work: paywalls, account only acsess, text obferscation (e.g. using a custom font that maps letters randomly to other ones so it looks fine but to a webscraper it looks like gibberish), HTML obferscation (inserting random characters in the HTML then hiding them using CSS) and many more.