


Leila Khaled: Who planted terrorism in our area? Some came and took our land, forced us to leave, forced us to live in camps. I think this is terrorism. Using means to resist this terrorism and stop its effects - this is called struggle. https://wordsmith.social/protestation/quotes#quote9220


Lenin: Every revolution means a sharp turn in the lives of a vast number of people. Unless the time is ripe for such a turn, no real revolution can take place. And just as any turn in the life of an individual teaches him a great deal and brings rich experience and great emotional stress, so a revolution teaches an entire people very rich and valuable lessons in a short space of time. https://wordsmith.social/protestation/quotes#quote9221


Lenin: During a revolution, millions and tens of millions of people learn in a week more than they do in a year of ordinary, somnolent life. For at the time of a sharp turn in the life of an entire people it becomes particularly clear what aims the various classes of the people are pursuing, what strength they possess, and what methods they use. https://wordsmith.social/protestation/quotes#quote9222


Short Planned Maintenance Tonight


My apologies if this is inconvenient. I opted to do it on shorter notice without a set hour because (a) there's not a lot of activity on the server and (b) I'm really impatient.

I'm doing a hardware upgrade that requires rebooting the network storage backend, which will bring everything down for a short time. The hardware swap itself should take well under 30 minutes; most of the downtime is just going to be the database starting back up, which often takes about another 30 minutes.

As part of this I'll also be deploying some software updates that require a reboot to take effect.



WTF?


I honestly haven't the foggiest idea how this happened, but apparently the DNS settings on the servers got changed a few days ago with absolutely no explanation (and to nonsense values, for some reason). I'm going to keep an eye on them to make sure they don't change again.

Additionally, I think that set off a cascade that caused the other problems.

Any posts you've made over the past 2-3 days haven't been sent to other servers, but will start sending now.

As for the other problems, I think the broken DNS made a lot of processes lag, taking way longer and using more resources than usual, because every attempt to contact another server timed out on the DNS request.
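For keeping an eye on the DNS settings, here is a minimal sketch of one way to notice a change: record a fingerprint of the resolver config and compare it later. It assumes the settings live in a plain file such as `/etc/resolv.conf`, which the post does not confirm, and the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return a SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(path, known_fingerprint):
    """True if the file no longer matches the fingerprint recorded earlier."""
    return fingerprint(path) != known_fingerprint
```

A scheduled job could store `fingerprint("/etc/resolv.conf")` once (the path is an assumption) and raise an alert whenever `has_changed` returns `True`.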



DOS Overload


There have been some recent outages of the server. I've tracked the root cause down to the server getting overloaded with requests (mostly updates from other servers). Those updates have been coming in faster than the server can process them, which prevents other requests from getting through.

I've made some tweaks that I believe have resolved it, fingers crossed.

Technical explanation:

The servers ran out of php-fpm threads to handle requests. Each was configured with a static count of 30 (60 total). They were definitely impacted significantly by memory leaks, which kept the count low.

I've changed it from static to ondemand and increased the count to 100 each. I'll probably go in and increase it again, since it's still pegged at that limit almost constantly. Thankfully, running on-demand seems to be keeping the memory usage per thread drastically lower.

Where the static assignment of 30 was eating up 8GB of RAM, 100 on-demand threads are only taking up 1.3GB.

I'm going to increase it until it's either hitting memory constraints or it's no longer constantly at full capacity.
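As a rough back-of-the-envelope check on those numbers, the sketch below works out the average memory per thread in each mode and how many on-demand threads a given budget could hold. The 8GB budget is a hypothetical value (the post does not say how much RAM is available), and the average would likely rise under sustained load, so treat the result as an optimistic upper bound.

```python
def per_worker_mb(total_gb, workers):
    """Average memory per worker in MB, treating 1 GB as 1024 MB."""
    return total_gb * 1024 / workers


def max_workers(budget_gb, mb_per_worker):
    """How many workers fit in a memory budget at a given average footprint."""
    return int(budget_gb * 1024 // mb_per_worker)


static_mb = per_worker_mb(8, 30)       # from the post: 30 static threads, 8GB
ondemand_mb = per_worker_mb(1.3, 100)  # from the post: 100 on-demand threads, 1.3GB

print(f"static:    {static_mb:.0f} MB per thread")
print(f"on-demand: {ondemand_mb:.0f} MB per thread")

# Hypothetical budget, chosen only for illustration.
budget_gb = 8
print(f"{budget_gb} GB fits about {max_workers(budget_gb, ondemand_mb)} on-demand threads")
```

That works out to roughly 273 MB per static thread versus about 13 MB per on-demand thread, which is why there is room to keep raising the limit.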

in reply to Server News

There's definitely some sort of time and code problem involved as it hit again this morning even with the previous changes, though this time it only impacted updates (making posts/comments/likes, getting new posts). I think reading was unaffected because those operations are faster and require significantly less memory.

For whatever reason, sometime around midnight the server gets hit with a bunch of requests that all seem to lock up, eating large quantities of memory and refusing to exit. (With on-demand, threads exit after 10s of being idle, yet there were over 100 threads running continuously from midnight until I killed them around 9am.) There was also a massive flood of updates from other servers at the same time, so I think it might just be a bunch of large servers sending bulk updates or some such.

New tuning to handle that: I put firmer time limits into PHP to prevent threads from running forever. There are two options for setting max times, and the first was getting ignored (I think Friendica overrode it?). The second should override that and kill any threads that run too long.
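The difference between those two kinds of limit can be illustrated with a small sketch: a limit enforced from outside the worker cannot be raised by the application code running inside it. This is a Python analogy, not the PHP configuration itself. The post does not name the two options; in a typical PHP-FPM setup they might be `max_execution_time` (which application code can change at runtime) and `request_terminate_timeout` (enforced by the process manager), but that is an assumption. The commands below are hypothetical stand-ins for a quick request and a stuck one.

```python
import subprocess
import sys


def run_with_hard_limit(cmd, limit_seconds):
    """Run cmd; return its exit code, or None if it was killed for running too long.

    subprocess.run kills the child when the timeout expires, so the limit is
    enforced by the parent and the child's own code cannot extend it.
    """
    try:
        return subprocess.run(cmd, timeout=limit_seconds).returncode
    except subprocess.TimeoutExpired:
        return None


# Hypothetical stand-ins for a quick request and a stuck one.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

print(run_with_hard_limit(quick, limit_seconds=10))  # exit code of the quick job
print(run_with_hard_limit(stuck, limit_seconds=1))   # killed after about one second
```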

In addition to that, I set up a rate limiter on the inbox endpoint (where other servers send updates). This should help keep those updates from overloading the server. The majority of the time it'll just slow them down by a second or two; when the server is overloaded, the rate limit should help keep it accessible for users.
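To show the general idea of a limiter that delays requests before it starts rejecting them, here is a minimal sketch. It is not the actual limiter on the server (the post does not say which one is used); the class name, rate, and burst size are all made up for illustration, and time is passed in explicitly so the behaviour is easy to follow.

```python
class DelayingRateLimiter:
    """Spaces requests at least 1/rate seconds apart, delaying up to `burst` intervals."""

    def __init__(self, rate_per_second, burst):
        self.interval = 1.0 / rate_per_second
        self.max_delay = burst * self.interval
        self.next_free = 0.0  # earliest time the next request may be served

    def admit(self, now):
        """Return seconds to delay the request arriving at `now`, or None to reject it."""
        start = max(now, self.next_free)
        delay = start - now
        if delay > self.max_delay:
            return None  # too far behind: shed load instead of piling up
        self.next_free = start + self.interval
        return delay


# Six updates arriving at the same instant, limited to 2 per second.
limiter = DelayingRateLimiter(rate_per_second=2, burst=4)
print([limiter.admit(now=0.0) for _ in range(6)])
# → [0.0, 0.5, 1.0, 1.5, 2.0, None]
```

Under light load every request gets a delay of zero. During a flood the senders are slowed by up to a couple of seconds, and anything beyond that is turned away so the backlog cannot grow without bound.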