The problem:
The web has obviously reached a high level of #enshitification: paywalls, exclusive walled gardens, #Cloudflare, popups, CAPTCHAs, Tor blockades, dark patterns (esp. w/cookies), JavaScript that makes the website an app (not a doc), etc.
Status quo solution (failure):
#Lemmy & the #threadiverse were designed to inherently trust humans to only post links to non-shit websites, and to only upvote content that has no links or links to non-shit venues.
It’s not working. The social approach is a systemic failure.
The fix:
- stage 1 (metrics collection): There need to be shitification metrics for every link. Readers should be able to click a “this link is shit” button on a per-link basis, and there should be tick boxes to indicate the particular variety of shit that it is.
- stage 2 (metrics usage): If many links with the same hostname show a pattern of matching enshitification factors, the Lemmy server should automatically tag all those links with a warning of some kind (e.g. ⚠, 💩, 🌩).
- stage 3 (inclusive alternative): A replacement link to a mirror is offered. E.g. youtube → (non-CF’d invidious instance), cloudflare → archive.org, medium.com → (random scribe.rip instance), etc.
- stage 4 (onsite archive): good samaritans and over-achievers should have the option to provide the full text for a given link so others can read the article without even fighting the site.
- stage 5 (search reranking): whenever a human posts a link and talks about it, search crawlers notice and give that site a high ranking. This is why search results have gotten lousy – because the social approach has failed. Humans will post bad links. So links with a high enshitification score need to be obfuscated in some way (e.g. dots become asterisks) so search crawlers don’t overrate them going forward.
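To make stages 2, 3 and 5 a bit more concrete, here is a rough Python sketch. It is not Lemmy code and nothing in it comes from Lemmy's actual codebase: the function names, the `min_links` threshold, the factor labels and the `invidious.example` placeholder are all made up for illustration, and swapping medium.com for scribe.rip while keeping the path is an assumption about how that mirror maps URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mirror table (stage 3): original hostname -> mirror hostname.
MIRROR_HOSTS = {
    "medium.com": "scribe.rip",
    "www.youtube.com": "invidious.example",  # placeholder, not a real instance
}


def hostname(url):
    """Lower-cased hostname of a URL, or an empty string if it has none."""
    return (urlsplit(url).hostname or "").lower()


def flagged_hosts(reports, min_links=3):
    """Stage 2: find hostnames with a repeated pattern of the same factor.

    `reports` is an iterable of (url, factors) pairs, one pair per reader
    report from stage 1. A host is flagged for a factor once at least
    `min_links` distinct links on that host were reported with it.
    Returns a dict mapping hostname -> set of flagged factors.
    """
    links_seen = defaultdict(set)
    for url, factors in reports:
        for factor in factors:
            links_seen[(hostname(url), factor)].add(url)

    flagged = defaultdict(set)
    for (host, factor), links in links_seen.items():
        if len(links) >= min_links:
            flagged[host].add(factor)
    return dict(flagged)


def mirror_url(url):
    """Stage 3: return the same URL on a known mirror host, or None."""
    parts = urlsplit(url)
    mirror = MIRROR_HOSTS.get((parts.hostname or "").lower())
    if mirror is None:
        return None
    return urlunsplit(parts._replace(netloc=mirror))


def obfuscate(url):
    """Stage 5: turn the dots in the hostname into asterisks."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=parts.netloc.replace(".", "*")))
```

Counting distinct links rather than raw reports means one annoyed reader hammering the button on a single link can't get a whole hostname flagged by themselves. Where the threshold should sit is a policy question for each instance.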
This needs to be recognized as a #LemmyBug.
So, first off, I love everything you have here.
The only thing: the onsite archive. I’d love it, but I wouldn’t want copyright law used to punish the Lemmy community. I don’t think I’m quite qualified to answer this question, so I’ll ask it here: how worried should we be about that?
It would need some analysis by legal experts. But consider that archive.org gets away with it, although archive.org has an opt-out mechanism. So perhaps each Lemmy instance should have an opt-out mechanism, which should push a CAPTCHA, in perhaps one of the few good uses for CAPTCHAs. Then if Quora wants to opt out, they have to visit every Lemmy instance, complete the opt-out form, and solve the CAPTCHA. Muahaha!
Note as well how 12ft.io works: it serves you Google’s cache of a site (which is actually what the search index uses). How did Google get a right to keep those caches?
There’s also the #fairUse doctrine. You can quote a work if you’re commenting on it, which is what we do in the threadiverse. Though not always – so perhaps the caching should be restricted to threads that have comments.
Archive.org doesn’t really “get away with it.” They face frequent lawsuits and have a steady stream of donations to fight them, along with enough staff to handle responding to takedown demands etc. That isn’t true of most Lemmy instances.
Just like Greenpeace paves the way for smaller activist groups that can’t stand up to challenges, archive.org would serve in the same way. When archive.org (with ALA backing) wins a case, that’s a win for everyone who would do the same. Lemmy would obviously stay behind on the path archive.org paves and not try to lead.
I mean, does archive.org get away with it, though?
They have legal troubles not infrequently and they’ve lost at least one copyright case that I know of recently.
I doubt that all the Lemmy instances, even if they pooled their resources, could afford to fight a copyright case.
And do I really have to spell out how Google gets away with caching stuff?
Finally, “fair use” isn’t a magic phrase that absolves you of all liability in any copyright claim. I’m extremely skeptical fair use could be twisted to our defense in this particular case.
They get blocked by some sites, and some sites have proactively opted out. archive.org respects the opt-outs. AFAICT, archive.org gets away w/archiving non-opt-out cases where their bot was permitted.
You might need to explain why 12ft.io gets away with sharing google’s cache, as Lemmy could theoretically operate the same way.
When you say “twisted”, do you mean commentary is not a standard accepted and well-known fair use scenario?
Archive.org is more than The Wayback Machine. You’re just talking about The Wayback Machine, not archive.org as a whole. Nothing I’ve said in this thread is about The Wayback Machine specifically.
My point is that archive.org does things that bend, skirt, and run afoul of copyright law (and good on them because fuck the system) and they spend more money, time, and resources fighting copyright suits than I’d imagine all Lemmy instance owners pooling their resources could afford. And that’s if they even cared enough to risk dying on that hill.
Not sure how this bit is relevant. I was speaking only about your “stage 4 (onsite archive)” item. (I thought that was pretty clear, but apparently not?) I don’t know if 12ft.io is playing with (legal) fire or not, but I’m not sure why it matters to the conversation. Nothing 12ft.io does is comparable to Lemmy users copying articles into comments.
So, I’m only going to be talking about U.S. “fair use” here because as little as I know about that, I know far far less about copyright law in other countries. That said:
First, whether fair use applies is a fairly complex matter which depends, among other things, on how much of the original work is copied. While maybe not technically determinative of the validity of a fair use defense, “the whole damn article” definitely won’t help your case when you’re trying to argue a fair use defense in federal court.
Second, I think for a fair use argument to work the way you seem to be suggesting, the quoted portions of(!) the article would have to appear in the same “work” as the commentary, but I’d imagine typically all comments in a Lemmy thread would be distinct “works.” Particularly given that each comment is independently authored and mostly by distinct authors. (Copying an entire article into a comment and following it with some perfunctory “commentary” would be a pretty transparent ham-fisted attempt at a loophole. Again, a very bad look when you’re arguing your defense in federal court.) I don’t know about your Lemmy instance, but mine doesn’t seem to say anything in the legal page that could provide any argument that a thread is a single “work.” (It does say “no illegal content, including sharing copyrighted material without the explicit permission of the owner(s).”)