• 0 Posts
  • 7 Comments
Joined 5 months ago
Cake day: January 27th, 2025

  • @Fletcher Not only is it a gold mine for scrapers (AI-purposed or whatnot), but even things deleted from the fediverse (and, by extension, Lemmy) continue to appear out there (e.g. on Google Search), be it through federated instances or through direct scraping.

    I feel like a personal example of that: I deleted my Lemmy account. Still, much of my content lingers on Google and other search engines through instances I had never seen before.

    However, it’s not because the fediverse is open: it’s because of how the Web (or, at least, the Clearnet) works. If someone can access it, it can become available for others to access. When even DRM-protected, pay-walled content still ends up being openly accessible somewhere, it’s no surprise fediverse content can, too. Everything done on the Clearnet will end up in many places simultaneously, outlasting any deletion: the Internet Archive is a common place to find digital ghosts.

    While it seems ominous, it is thanks to this very nature that much important and/or useful content can still be accessed (e.g. certain scientific papers and studies that were politically removed by a government, certain old/ancient games that fell into corporate/market oblivion, certain books from long-gone publishers).

    To quote Cory Doctorow: “Scraping against the wishes of the scraped is good, actually”. The problem isn’t scraping, but the intentions of whoever uses the scraped content, particularly if such a “who” is a corporation (such as Google and Microsoft).

    Problem is: in the eyes of a webmaster, well-intentioned scraping isn’t distinguishable from corporate scraping. They’re all broad GETs (i.e. akin to the “all the things” meme), perhaps differing in scale, distribution and frequency, but broad GETs nonetheless. People have been setting up Anubis (the libre PoW CAPTCHA solution) or Cloudflare (the MitM corporation) to avoid AI crawling, but their sites also become prone to oblivion when, say, their servers end up disappearing forever one day, taking all their content to the realms of /dev/null: much of it unique, useful content, gone because no archiving tool (e.g. the Internet Archive) could reach it.

    IMO, you’re not wrong, but scraping isn’t wrong per se, either.


  • @WhyJiffie

    oh, I too often do this, with emails, where I compose it for a long time, all the while it changes a lot

    Curiously, there seems to be a psychological factor behind this: when we’re composing emails, we are focused on a single mail. Email composition boxes are often bigger and wider than those of social networks, and they often appear as fullscreen textareas (separated from the mail being replied to, if there is one). That’s possibly why it seems easier to do this when composing email. A tip? Notepad apps (such as Noto, Sketchbook, Joplin or even mainstream ones such as Google Keep) can mimic an email composition box. My previous reply to you was initially written in Noto, until I transferred it to my PC. Perhaps this could help if you wish to apply the same habit to the fediverse.

    that, or what reddit does: replace the username with “deleted”

    In a sense, yeah, a “deleted” username placeholder (applied automatically when a person chooses to delete their own account) is also an interesting solution. However, there are some things I forgot to mention in my previous replies, one of which is GDPR’s “Right to be forgotten”, which could pose a legal obstacle for such a solution if, as happened with Reddit, the content is restored against the user’s will. As context: when people left Reddit to come to the fediverse/Lemmy, Reddit undid many of the deletions, so they could both astroturf the platform (make it appear like they have a large userbase when they don’t anymore) AND train corporate AI with all the posts and comments. This has probably led to legal issues already, or will in the future, if people eventually find a contradiction in the legalese of Reddit’s ToS and decide to sue Reddit based on GDPR rights.

    At the end of the day, it’s a complicated matter, because it feels like there’s no easy solution that could respect both the community AND the user behind the content while complying with certain laws out there, especially when things can unexpectedly change in the future (e.g. corporate AI managing to haunt the fediverse) and lead people to decide on nuking entire posts.


  • @WhyJiffie Disclaimer: I’m not sure if Friendica is respecting the thread format from Lemmy. In my first attempt, Friendica sent this reply as a whole new sub-thread instead of as part of the previous sub-thread. Sorry if this is being sent outside the sub-thread; it’s a glitch from Friendica.

    I’m sorry for your bad experiences

    Thanks

    sometimes the person is just in a hurry or something

    On the one hand, it makes sense. Hurry is perfectly understandable, given how “modern life” often vampirizes human time (while also vampirizing our attention span, which ties in with, and exacerbates, the phenomenon you described as severe attention deficit).

    However, the hurry to reply is just another symptom/phenomenon brought by online activities: we’re often expected to act “now”, reacting to real-time information, prioritizing action over (deep) thought… and there’s a Brazilian saying, “a pressa é inimiga da perfeição”, roughly translating to “hurry is the enemy of perfection”. Things don’t need to be that fast, at least not immediately (not that we need to seek perfection: here, “perfection” is just a euphemism for well-thought-out interactions).

    For example (a meta-example): this reply to your reply wasn’t written so recently. I saw your reply when it had been 10 minutes since you had sent it (11 hours ago). Then I read it, then I read it again, and again… I read it several times so I could understand all the points you shared. Even though I wasn’t going to reply immediately (i.e. as soon as I saw it), I began to gather fragments from my thoughts-replies (which started to pop up inside my head as soon as I began reading), writing these fragments down as notes so I could further develop and compile them, only actually sending my reply when it was complete and ready. It’s an old habit of mine, gradually writing and preparing a text/reply/post over hours or days.

    Maybe I got this habit through literature, where I often write down and compile my thoughts as they pop up. Maybe I got this habit from Geminispace (a cyberspace within the so-called smallweb/smolweb), whose protocol prefers and encourages raw text over media. Maybe it’s a fundamental part of archetypal traits from ND and/or PDs… In any case, it can be seen as a proof of concept of how interactions can happen without the hurry of the modern web, allowing for greater depth of interaction.

    Also, it can be pointed out that developing a response gradually over hours, in a way, helps with both attention deficit and anxiety. Of course, there are no simple one-size-fits-all solutions because each person is different, but it seems like a useful approach (saving/bookmarking what is going to be replied to and developing the reply gradually over the day or over a few days, without any hurry to do it immediately; IMHO, the Web would be slower-paced, but richer and deeper in content than it is nowadays).

    brainrot platforms like tiktok really don’t help with this worldwide issue.

    Exactly. This is also why I mentioned Geminispace in my previous paragraph: there’s a jarring contrast between its raw text format and the fast-consumption media (not that different from “fast food”: readily available, but unhealthy) from TikTok and other mainstream “social” networks, with the former prioritizing brains and the latter prioritizing gains.

    very short, meaningless comments, which also have other properties I don’t know how to put into words.

    Another word I would think of is superficiality.

    The cases where I find deletion problematic always had something useful in them, either the post or the threads.

    Losing useful information/knowledge is frustrating, especially in a world where purposeful knowledge is becoming increasingly scarce… Although I’m not sure how useful the things I wrote and sent on Lemmy really were for people, I guess there was possibly helpful content (explanations, tips, etc.) among the hundreds of personal entries that got deleted. That’s because I deleted my Lemmy account as a whole, so I had no means to keep certain entries I wished I could keep.

    One solution could be ActivityPub allowing a departing user to replace their own actor on given posts with a community/instance-wide actor (thus a “de-actorification” of sorts), so the activity would effectively become part of the public domain (given explicit consent from the actor, the community and the instance, of course). But it’s not an easy thing to implement nor to fully achieve in practice, unfortunately.
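
    To make the idea a bit more concrete, here is a rough sketch of what such a “de-actorification” could look like on a single post. ActivityPub doesn’t define this operation, as far as I know: the `reattribute` function, the consent dictionary and the example URLs are all made up for illustration. The only real part is `attributedTo`, the ActivityStreams property that says who authored an object.

```python
import copy

def reattribute(note: dict, community_actor: str, consent: dict) -> dict:
    """Return a copy of `note` attributed to a community-wide actor.

    Hypothetical operation: requires explicit consent from the author,
    the community and the instance before anything is changed.
    """
    required = ("author", "community", "instance")
    if not all(consent.get(party) for party in required):
        raise PermissionError("explicit consent from all three parties is required")
    updated = copy.deepcopy(note)                # leave the original object untouched
    updated["attributedTo"] = community_actor    # swap the personal actor for the shared one
    return updated

# Made-up example object and actors (placeholder URLs).
note = {
    "type": "Note",
    "id": "https://instance.example/post/123",
    "attributedTo": "https://instance.example/u/departing_user",
    "content": "A useful explanation someone may still need.",
}
community = "https://instance.example/c/some_community"

orphaned = reattribute(
    note, community, {"author": True, "community": True, "instance": True}
)
```

    Even in this toy form the hard part is visible: the change only affects the copy held by the instance that runs it, and every other federated copy would still carry the original actor unless it is updated too.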



  • @WhyJiffie I’ll try to reply using this platform (Friendica… I had no success with Mastodon, Tootik and Pixelfed). I tried to reply to a reply in this thread but my answer failed to federate (and Friendica doesn’t return their reply in the search box). I’m replying with the following intent: to remind about neurodivergence. I am ND myself (I’m not autistic, but I was diagnosed with schizotypal PD, and I suspect I could actually have Geschwind syndrome; in any case, I’m certainly ND because I can’t think/express nor see/perceive/feel things in “typical ways”).

    ND people express themselves in non-typical ways (my reply is hopefully an example of that). ND people are often mistaken for AI (and this can further deepen the alienation ND people often feel and suffer from). People often downvote content without trying to engage further or explain _why_ they downvoted, and ND content is more prone to downvotes due to sincericide (exacerbated sincerity) and a seeming lack of “emotional resonance” (i.e. “cold-sounding” texts) with NT (neurotypical) people. Or ND content is simply ignored, ghosted, relegated to the void, either because NT people don’t know how to further engage with such content, or because NT people couldn’t even be bothered to try and read it in the first place (people are becoming accustomed to short texts and fast content, and ND texts can be looooong). Best case scenario, ND people get superficial replies because their content couldn’t communicate what they intended to communicate. And this can be pretty infuriating/frustrating, especially because ND people often face a lack of belonging, feeling like they can’t fit in anywhere… and this often leads to resigned departure, which you referred to as “permanently deleted posts”.

    This is something I did: I left Lemmy many months ago, partly because of the many phenomena I described. It feels frustrating to be yourself and be drowned in ghosting, downvoting, superficiality or prejudice, even though I tried not to let it bother me… but what we write are fragments from deep within our souls. I can understand the feeling of watching a reply vanish along with an entire post, it’s frustrating… just as it’s frustrating to watch a post being misread or ghosted because I was born akin to an extraterrestrial trying to communicate with fellow humans to no avail. That’s why I often find myself “nuking” my own content: because there’s no reason to keep a communication attempt that led to no meaningful and deep communication. I hope this clarifies one of the reasons why “Permanently deleted” could happen.



  • @cyrano The “problem” (actually, the feature) with those censorship algorithms is that they rely a lot on the “exact contents” of the message (“Scunthorpe Problem”). X is probably programmed to detect Signal’s domain and block the message due to the presence of such a link (similar to how Facebook was/is blocking links to the largest PixelFed instances, and then also decided to block links to DistroWatch and the official websites of various Linux distros), so it’s not programmed (yet) to censor just the “hexadecimal/base64/whatever” portion of the link alone. And that’s where Tox could shine: a handle is literally a hexadecimal sequence, without Tox’s domains, without URI schemes, just a bunch of digits and letters from A to F.
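
    As a rough illustration of that point, here is a minimal sketch of a domain-based link filter. It is only an assumption about how such a filter might work, not X’s or Facebook’s actual code; the blocklisted domain and the Tox ID are made up (a real Tox ID is 76 hexadecimal characters, but this one has no valid checksum).

```python
import re

# Hypothetical blocklist: a placeholder domain, not a real filter rule.
BLOCKED_DOMAINS = {"blocked-messenger.example"}

# Capture the host part of any http(s) URL found in a message.
URL_RE = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)

def is_blocked(message: str) -> bool:
    """Return True if the message links to a blocklisted domain or subdomain."""
    for host in URL_RE.findall(message):
        host = host.lower().split(":")[0]  # drop an optional port
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            return True
    return False

# Made-up 76-character hexadecimal string shaped like a Tox ID.
FAKE_TOX_ID = "A1B2C3D4" * 9 + "E5F6"

link_message = "Reach me at https://blocked-messenger.example/#abc123"
hex_message = f"Reach me on Tox: {FAKE_TOX_ID}"

print(is_blocked(link_message))  # the link carries the blocklisted domain
print(is_blocked(hex_message))   # a bare hex handle has no domain to match
```

    A filter like this has nothing to latch onto in the second message: there is no URL at all, so it would have to start guessing which long hex strings are handles, which is a much blunter rule.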

    I don’t know why Tox isn’t mentioned as an “instant messaging platform for whistleblowers”: it supports tunneling through Onion (Tor), as well as through I2P outproxies (because it actually accepts any kind of SOCKS5 proxy), and it’s registration-less (even Matrix needs registration), so it’s effectively anonymous IMO.

    SimpleX seems to be that, too, although I haven’t had the opportunity to use it as much as I used Tox. But from the little I’ve used it, it’s similar to Signal in the sense that its handle is a link (and a long one) and not simply a hash/hex sequence.


  • @rimu @Bronzebeard On the one hand, when DeepSeek “doesn’t know” about a thing (i.e., something not present in the training data), it’ll state it clearly (I’m not sure if the image will be sent as I’m not using Lemmy directly to reply to this):

    The context of the image is the following: I asked DeepSeek about “Abnukta”, an obscure, little-known Enochian term that is used during one of the invocations of Lilith, and DeepSeek replied the following:

    “Abnukta is a term that does not have a widely recognized or established meaning in mainstream English dictionaries or common usage. It could potentially be a misspelling, a neologism, or a term from a specific dialect, jargon, or cultural context. If you have more context or details about where you encountered the term, I might be able to provide a more accurate explanation. Alternatively, it could be a name or a term from a specific field or community that is not widely known”.

    So, the answer that the user Rimu received is not about something “unknown” to the LLM (otherwise it’d be clearly stated as such, as per my example), but about something that triggered moderation mechanisms. So, in a sense, yes, the LLM refused to answer…

    However… On the other hand, western LLMs are full of “safeguards” (shouldn’t we call these censorship, too?) regarding certain themes, so it’s not exclusive to Chinese LLMs. For example:
    - I can’t talk about demonolatry (the worshiping of daemonic entities, as present in my own personal beliefs) with Claude: it’ll ask me to choose another subject.
    - I can’t talk with Bing Copilot about some of my own goth drawings.
    - Specifically regarding socio-economic and political subjects, people can’t talk with ChatGPT and Google Gemini about a certain person involved in a recent US event, whose name is the same as a video-game character known for wearing a green hat and being the brother of another character that enters pipes and seeks to set free a princess.
    - GitHub Copilot refuses (in a blatant Scunthorpe Problem) to reply or suggest completions for code containing terms such as “trans” or “gender” (it’s an open and known issue on GitHub, with no answer so far as to why, or how to make Copilot answer).
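
    For anyone unfamiliar with the Scunthorpe Problem, the sketch below shows the general failure mode with the two terms mentioned above. It is not how Copilot is actually implemented (that isn’t public, as far as I know); the function names and the code snippet being checked are invented for illustration.

```python
import re

# Terms taken from the Copilot example above; the filter logic is hypothetical.
BLOCKED_TERMS = ["trans", "gender"]

def naive_filter(text: str) -> bool:
    """Flag the text if any blocked term appears anywhere, even inside a longer word."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

def whole_word_filter(text: str) -> bool:
    """Flag the text only if a blocked term appears as a standalone word."""
    pattern = r"\b(" + "|".join(map(re.escape, BLOCKED_TERMS)) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

# An ordinary line of code that merely contains "trans" as a substring.
snippet = "def commit_transaction(db): ..."

print(naive_filter(snippet))       # substring match: flagged
print(whole_word_filter(snippet))  # whole-word match: not flagged
```

    Whole-word matching only removes the accidental hits, of course: a variable simply named `gender` would still be flagged, so it doesn’t make the blocking itself any less arbitrary.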

    But yeah, the West is the land of freedom /s