I could give a well thought out comment about AI, scraping the web, giving your end users a worse experience in order to copyright their content for your own benefit, or any number of subjects.
However, all I can muster is this:
lol
I swear I read they’re doing it with Lemmy too
I’d imagine they’d be doing it for the whole fediverse. I mean from their perspective, why not? The whole open nature, in my limited understanding, seems to make that easier.
they crawl the entire internet, lemmy is definitely included
Since multiple Lemmy instances show the same federated content, I wonder if our posts will carry more weight in the model: to a normal scraper, it would look as if many people repeated the same thing over and over on different sites.
It would be trivial to remove duplicates. With a little bit of foresight they could just as easily avoid the duplicates in the first place.
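To illustrate how trivial it is, here's a minimal sketch of deduplicating federated copies by content hash. The post structure and field names are hypothetical, purely for illustration; real pipelines normalize far more aggressively.

```python
import hashlib

def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivially re-rendered copies match."""
    return " ".join(text.lower().split())

def dedupe(posts):
    """Keep only the first post per distinct normalized body."""
    seen = set()
    unique = []
    for post in posts:
        digest = hashlib.sha256(normalize(post["body"]).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(post)
    return unique

posts = [
    {"instance": "lemmy.world", "body": "Scrapers love open  data."},
    {"instance": "lemmy.ml",    "body": "scrapers love open data."},  # federated copy
    {"instance": "beehaw.org",  "body": "Something else entirely."},
]
print(len(dedupe(posts)))  # 2: the federated copy is dropped
```

Avoiding the duplicates up front would be even cheaper: a crawler that understands federation could fetch each post once from its origin instance instead of from every mirror.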
They could also avoid re-crawling the whole internet every day, but here we are, so who knows.
Everything that’s freely accessible on the internet has been scraped 400 times over.