My original, editorialized title: Ars Technica Sells Out
Linking to this because I know people here read Ars Technica, and I totally didn’t become a subscriber three days before this was announced. Nope. No sir.
This is the logical endpoint for all the people who were complaining that scraping the open web for training is somehow immoral/illegal. Instead of stopping AI, those with deep pockets will continue to train on everything, while open source and small company efforts will be locked out.
Useful AI will be focused and narrow unless they actually achieve AGI.
Scraping literally the whole internet for inspiration is part of the reason they come up with utter rubbish. No one’s actually scrutinizing what they’re ingesting. It’s not so much a problem that they violate copyright; it’s more that because they do it in this manner, their output is garbage.
If these AI companies actually did some content curation we might get decent AI out of it.
Damnit! I still like and respect the Ars Technica staff but Condé Nast can piss off.
I feel for you KingThrillgore. I was thinking of supporting the site with a subscription but not after this. Still, if enough people stop subscribing we may lose them altogether. This is a double-edged shit sword.
I made it clear in my comment I was not happy about this after I became a subscriber. It will not auto-renew. If I had done it with a credit card and not PayPal, I’d try for a chargeback.
For what it’s worth, nobody else active on the site is happy about this, either. Lots of unsubscribes are being claimed in the comments (including mine).
Understood and have seen the comments. I don’t blame you for being upset. It’s a crappy situation.
Fuck those shitheads.
The main problem now is that Ars Technica and all other Condé Nast publications, now that they have a vested interest in OpenAI (they’re getting paid by them), can no longer be reliably trusted to report on any AI or AI-adjacent topic whatsoever. And every user comment and piece of content is now owned by OpenAI.
I’m not so sure Ars has a vested interest in OpenAI. I actually read through 10 pages of comments. Ken Fisher was pretty active in them, and noted several times that Ars doesn’t see any of the money from this deal.
Ars does not. Their chief editor has said as much. But Condé Nast absolutely does, and it WILL happen that CN starts telling Ars what they can and can’t do, because that’s how corporate ownership works. It happens every single time without fail, and that is why you can no longer trust anything they post related to AI. You toe the line to capital’s interest, or CN is gonna eject you from the org.
I bet MFCBot would work super well if it were updated to report on whether the publication’s owners had a vested interest in the topic.
Oh it would. It would require the MFC site itself to actually collect and collate data on the investment portfolios of sites’ parent companies, and that’s a pretty massive ask from what sounds like a very tiny team.
I wonder if it could somehow pull that data from somewhere like Ground News
what’s the problem here? openai isn’t pirating content if they pay for it. have i misunderstood something?
ai new
new bad
remember old time
old time good
Their EIC is in the comments replying to people. He seems buttoned up about it (understandably) but I get the impression he’s not thrilled
Definitely not thrilled, and he says in the comments that Ars does not benefit financially from this.
I want Ars content to be part of whatever training data is provided to the best models. How does that get done without appearing like they are being bought?
Even if their contract explicitly states that it is a data sharing agreement only and the products of the media organization (articles/investigations) are not grounds for breach or retaliation, it will be assumed that future reporting has lost some of its impartiality.
So, for all media companies, the options seem to be:
- Contribute to the greater good by openly permitting site scraping (for $0)
- Allow data sharing to contracted parties only (for a fee)
- Publicly or privately prohibit use of any data, and then seek damages down the road for theft/copyright infringement when the legal framework has been established.
Is there a GPL or other license structure that permits data sharing for LLM training in a way that it does not get transformed into something evil?
I understand them. If they refused for integrity reasons, OpenAI would steal their content anyways via scrapers.
Suing them for copyright infringement, even if that is what we all want, is ultra expensive.
I would also have signed that deal with the devil…
Either you sell your soul for something, or get it taken from you, leaving you with nothing.
I would have signed it too, but only because I’m awesome on fiddle and I could totally win my soul back.
llms suck because they steal content and are unreliable since they don’t link back to sources
OpenAI makes a deal to pay media org for their content and makes it so they can link back to original article
“Ars technica sold out”
It’s funny because you’re making the opposite point of the one you think you’re making. Cause if you put together the two pieces of information from your comment, the entire picture is:
OpenAI makes a deal to pay media org for their content and makes it so they can link back to original article, with the money they make from stealing everybody else’s content
That’s already pretty bad, even without the points you neglected to mention, like how some of the content that is indirectly making money for Ars Technica is stolen from their competitors, or how Ars Technica basically became a worthless journalistic source for AI at a time when public opinion is not yet settled on its morality and precedent has not been set on its legality. How is this not “sold out” to you?
Condé Nast didn’t just sell access to their subsidiaries’ content, but also to the user generated content on those subsidiaries’ sites. That’s at issue here.
It also has a possibility to cause a conflict of interest for Ars Technica to write about OpenAI. That’s the second issue here.
And, as per the editor in chief, the money doesn’t go to Ars Technica, but to Condé Nast.
Yes they sold access to the user content we’ve generated after we explicitly agreed to the fact that they may do so. If you’ve chosen to not read the fine print when you created an account and created content for them, that’s sort of up to you tbh.
Never trust Condé Nast to do right by its consumers. That’s a tale decades old at this point.
I am going to stop reading them myself.
It’s less ideological than practical. I simply don’t have the time to fact check them, or figure out which are the real articles and which are the AI ones, etc etc.
That’s not what’s happening. The AI is ingesting the human made content (articles and comments). It isn’t writing any of the content of the site. I’m just going to cancel my subscription if they don’t give me a means to opt out of my comments feeding ChatGPT.
Thanks for describing it, I misread the first sentence.
Still, it’s AI creeping into their news. Even if it doesn’t change the content now, the first step leads to a second step later, and I might not notice when that happens.
Best to just train myself to not use them now.