2024-09-16
I spent more than an hour trying to figure out why my server had a load of nearly 40. All I discovered is that the load went down when I shut down Emacs Wiki. Well, I needed to sleep and I’ve got plans for the next few days, so I shut it down while I slept, hoping that the misconfigured spider gets fixed or the inept programmer discovers their mistake. Just another day in the Butlerian Jihad. Some misguided soul probably wanted to download it all and wrote a broken web crawler, and when that got blocked they bought some nice scaling infrastructure from Amazon, Hetzner, OVH or Alibaba Cloud or whatever they are called, allowing them to use a gazillion different IP numbers. That will eventually lead me to implement some sort of cloud service provider block. – Alex
Switched Emacs Wiki back on after a few hours of sleep and it did fine. But then it restarted again… at 18:00, 19:00, 21:00, 22:00… and so I have switched Emacs Wiki off again. Time to ban some networks!
Anybody interested in my banning of IP ranges and possibly interested in me reverting any of these, take a look at ban-cidr … from a network that isn’t banned, I guess. 😏
2024-09-17. This continues to keep me busy and angry every evening. Too bad I don’t have a really fast pipeline from network lookup to firewall ban.
In any case, I added over a hundred Chinese networks to the firewall rules and I’m seriously considering blocking the whole country for a week. It seems that most of the offenders are networks run by China Telecom and China Mobile.
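For what it's worth, the first half of such a lookup-to-ban pipeline could be sketched roughly like this. It is not the ban-cidr script mentioned above, just an illustration under a few assumptions: the access log starts each line with the client IP, a fixed /24 (IPv4) or /48 (IPv6) prefix stands in for a real network lookup such as whois, and the function names and the threshold of 100 requests are made up. It only prints candidate CIDR ranges; feeding them to the firewall is left out on purpose.

```python
import ipaddress
from collections import Counter


def network_of(ip_text, v4_prefix=24, v6_prefix=48):
    """Return the enclosing network of an address, or None if it does not parse.

    The fixed prefix lengths are an assumption standing in for a real
    network lookup (whois, ASN database, ...).
    """
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return None
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def busy_networks(log_lines, threshold=100):
    """Count requests per network; return (network, count) pairs at or above threshold."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # skip empty lines
        net = network_of(fields[0])  # assumes the IP is the first field
        if net is not None:
            counts[net] += 1
    return [(net, n) for net, n in counts.most_common() if n >= threshold]


def ban_candidates(log_lines, threshold=100):
    """Return the busy networks as CIDR strings, merging adjacent ranges."""
    nets = [net for net, _ in busy_networks(log_lines, threshold)]
    # collapse_addresses refuses mixed IPv4/IPv6 input, so merge them separately
    v4 = ipaddress.collapse_addresses(n for n in nets if n.version == 4)
    v6 = ipaddress.collapse_addresses(n for n in nets if n.version == 6)
    return [str(n) for n in list(v4) + list(v6)]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 busy.py < access.log
    for cidr in ban_candidates(sys.stdin):
        print(cidr)
```

Counting per network rather than per address is the point here: a crawler spread over a gazillion addresses never trips a per-IP limit, but its networks still stand out.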
2024-09-18. So far, so good. Load stays below two.
e3d3: Last week I added emacswiki to my newsfeed using openrss.org. When reading this post I worried that this had caused the disturbance, although I expected it to refresh the feed only twice a day. I don’t know much about networking, so to be sure I removed the feed on 2024-09-16 at about 18:00, give or take 5-10 minutes. Later I became a little paranoid that I had caused it by using the “random page” link too often (20-30 times). I wish I could help more. You can sometimes find me on #emacs; leave a message there if you want to contact me. Wishing you good luck and a good mood. Best regards.
No worries. These are badly programmed crawlers that will visit every single old revision of a page, download the individual feeds for every page, and so on. They have no concept of what is important and what is not. – Alex