I have a tiny instance that’s being absolutely overwhelmed after I connected it to other communities. I ran a script that gave me a list of something like 40K posts to hand off to the purge API, but somehow my disk usage keeps growing while the purge is running. The disk usage is caused by all the media, but I’m not sure how to efficiently nuke media that came from outside the instance. The API calls are kind of slow. I’d rather just issue a direct command to delete the media from existence, but I haven’t been able to find where the delete tokens for posts are stored so I can rapid-fire the command from within my server (and thus not have to stagger my calls to avoid being rate limited).
Can someone help me? I feel like there’s something pretty simple I’m overlooking here.
EDIT 1: Running some diagnostics, I learned that 10GB of my disk is media and 10GB is the activity table (Thanks @King@lemm.ee for pointing that out to me)
I am still left wondering how to purge the 10GB of worthless media in a way that doesn’t leave everything corrupted. Of course I can just navigate to where it is on disk and delete it, but this feels like a bad idea. My attempt to just run purge API calls has been stymied by rate limiting. Congrats to lemmy for that, but it really sucks for me when I need to delete a lot of files.
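For anyone in the same spot, here is a rough sketch of how the staggered purge calls could be structured so that a rate-limit response backs off instead of killing the run. Everything named here is an assumption for illustration: `purge_one` is a hypothetical callable you would write yourself to wrap the real purge request, and `RateLimited` is a hypothetical exception it raises when the server tells it to slow down. It does not get around the rate limit, it only copes with it.

```python
import time
from typing import Callable, Iterable, List


class RateLimited(Exception):
    """Hypothetical: raised by purge_one when the server says to slow down."""


def purge_all(
    post_ids: Iterable[int],
    purge_one: Callable[[int], None],
    delay: float = 1.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Purge posts one at a time and return the ids that never succeeded."""
    failed = []
    for post_id in post_ids:
        wait = delay
        for _ in range(max_retries):
            try:
                purge_one(post_id)
                break
            except RateLimited:
                # Back off, doubling the wait after every rate-limited attempt.
                sleep(wait)
                wait *= 2
        else:
            # Every attempt was rate limited; remember the id for a later run.
            failed.append(post_id)
        # Fixed pause between posts to stay under the limit.
        sleep(delay)
    return failed
```

The returned list can be fed back into a later run, so a 40K-post purge does not have to restart from scratch.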
I’ll upvote, that’s the best thing I can do for you. I have absolutely no idea how to help you, but maybe with more upvotes the people who do know will see your post!
What table is the culprit? I have a cron job to shut lemmy down at 3:00am every morning and I run a TRUNCATE activity via the psql utility. If I didn’t do that, my database size would swell to 50GB or more.
That’s a good point. I’ve just been assuming that the media is the issue, but perhaps it’s just the pure database 🤔 Does doing a truncate purge the media? If not, wouldn’t I just be orphaning all these pictures, etc. that have been downloaded? Also, what about the fallout for your own users? I don’t really want to drop the content that was created on the instance itself.
Unfortunately, a truncate does not purge the media. The media is controlled by pict-rs and it has its own database. I cannot speak to the fallout for my own users because my Lemmy instance is strictly my own. I don’t want to get into a situation where I am hosting accounts and have to deal with moderation and abuse. There are a lot of legalities surrounding this and I don’t need the headache.
I suggest trying to unsubscribe from each and every community that is delivering to you and figure out why you are still getting incoming data.
Media isn’t federated. The media should just be referenced with a link to the original source.
Normally, the largest use of disk space is the activity table. It is stored for six months, and is only useful for debugging. Below is the issue, along with SQL commands to check and purge this debugging table. Let us know if this was the issue.
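As a rough illustration of what checking and purging that table can look like (this is a sketch, not the commands from the issue), the helper below just builds two SQL strings to paste into psql. The table name `activity` comes from this thread; the `published` timestamp column and the 14-day default are assumptions, so check the real schema with `\d activity` first.

```python
def activity_sql(retention_days: int = 14) -> dict:
    """Build SQL to inspect and trim the activity table.

    Assumes a table named `activity` with a `published` timestamp column.
    """
    if retention_days < 0:
        raise ValueError("retention_days must be non-negative")
    days = int(retention_days)
    return {
        # Total on-disk size of the table, including indexes and TOAST data.
        "check_size": "SELECT pg_size_pretty(pg_total_relation_size('activity'));",
        # Delete rows older than the retention window.
        "purge_old": (
            "DELETE FROM activity "
            f"WHERE published < NOW() - INTERVAL '{days} days';"
        ),
    }


if __name__ == "__main__":
    for name, statement in activity_sql().items():
        print(f"-- {name}\n{statement}")
```

Keep in mind that a plain `DELETE` in PostgreSQL leaves the freed space inside the table file for reuse rather than handing it back to the operating system. A `TRUNCATE` or a `VACUUM FULL` is what actually shrinks the file on disk.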
Media absolutely gets federated. My pictrs folder is 10GB. Another 10GB is the activity table, so I tip my hat to you for finding that. I still have a very significant amount of worthless data on my disk though
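If anyone wants to measure their own media folder the same way, a minimal sketch is below. The path passed in is up to you (wherever your pict-rs volume is mounted, which varies by install), and the function name is just illustrative.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total
```

Dividing the result by `1024 ** 3` gives the size in GiB, which is easy to compare against the database size.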