Holy hackamole, bots are a problem!
Setting up the server
This site runs on a small VPS which costs me less than a cup of coffee each month. Given I wanted to be able to precisely control my server software, PHP version, extensions, libraries, mail system etc. and I have the technical knowledge to set these things up myself, it seemed a no-brainer as the cheapest option. My TLS certificate is provided free by Let's Encrypt and updated automatically as needed.
I'm not naive, either. I've been in this industry a while, I know there are people and - more immediately and commonly - automated bots who will probe any public server for weaknesses, attempt to identify what it's running, attempt to gain access and breach every possible avenue. SSH, mail server, web server and anything else exposing a socket are fair game for targeting. Which is why it's important to make sure you know how to configure at least the basics of hardening and security on your system before it's exposed.
But my god, I did not realise just how many bots attempt to breach even a random, tiny server like mine.
In the space of one week, I accumulated 10MB of log lines just rejecting unauthorized SSH connections. That's tens of thousands of attempts. None were successful, because I've disabled root login, don't have a stupid or obvious password and only have the one user - my own account - who is allowed to authenticate. My username is also not anything common or obvious like admin, user etc. which are all the types of username tried by these bots.
Here's a small sample from the logs:
Mar 28 00:02:28 sys sshd: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=220.127.116.11 user=root
Mar 28 00:02:31 sys sshd: Failed password for root from 18.104.22.168 port 40640 ssh2
Mar 28 00:02:34 sys sshd: Failed password for root from 22.214.171.124 port 40640 ssh2
Mar 28 00:02:38 sys sshd: Failed password for root from 126.96.36.199 port 40640 ssh2
Mar 28 00:02:40 sys sshd: Received disconnect from 188.8.131.52 port 40640:11: [preauth]
Mar 28 00:02:40 sys sshd: Disconnected from authenticating user root 184.108.40.206 port 40640 [preauth]
Mar 28 00:02:40 sys sshd: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=220.127.116.11 user=root
Mar 28 00:03:18 sys auth: pam_unix(dovecot:auth): check pass; user unknown
Mar 28 00:03:18 sys auth: pam_unix(dovecot:auth): authentication failure; logname= uid=0 euid=0 tty=dovecot firstname.lastname@example.org rhost=18.104.22.168
Mar 28 00:03:27 sys sshd: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=22.214.171.124 user=root
Mar 28 00:03:29 sys sshd: Failed password for root from 126.96.36.199 port 58500 ssh2
Mar 28 00:03:32 sys sshd: Failed password for root from 188.8.131.52 port 58500 ssh2
Mar 28 00:03:35 sys sshd: Failed password for root from 184.108.40.206 port 58500 ssh2
A few IP addresses in particular, all apparently originating in China, seemed to be responsible for the majority of attempts. These are now blocked on all ports. Additionally, I now block the SSH port at the firewall for every IP address except my own.
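If you want to find out which addresses are responsible for most of the noise in your own logs, a few lines of scripting will do it. The following is a minimal sketch in Python, not the tooling I actually used; the function name and the regular expression are my own assumptions, based on the "Failed password" line format shown in the sample above.

```python
import re
from collections import Counter

# Matches sshd lines such as:
#   "... sshd: Failed password for root from 203.0.113.7 port 40640 ssh2"
#   "... sshd: Failed password for invalid user admin from 203.0.113.7 port 40640 ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def count_failed_logins(lines):
    """Return a Counter mapping source IP -> number of failed password attempts."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts
```

Feed it an open log file (on many distributions that is /var/log/auth.log, but the location varies) and call .most_common(10) on the result to list the ten busiest sources.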
Completely block an IP address on Linux
To drop all incoming packets on all ports from a particular IP, run the following command, obviously replacing the IP below with the one you want to block. Bear in mind iptables rules are not persistent and you will need to save/restore them on restart.
iptables -A INPUT -s 220.127.116.11 -j DROP
But better yet, rather than maintaining iptables rules yourself, install a tool like Fail2Ban to monitor commonly targeted services and automatically block IP addresses attempting suspicious behaviour.
The web server
You cannot list the contents of a directory on my Apache installation (Options -Indexes). In fact, anything that's not a static file in the public document root (images and the like) will be routed to the blog application, which will in turn serve a 404 for unrecognized URLs.
This of course hasn't stopped bots attempting to probe my server for common application-level vulnerabilities and in particular, whether it's running WordPress (it isn't). I see a lot of log entries requesting WordPress paths such as /xmlrpc.php, which can be used not only to detect WordPress (and maybe its specific version) but also as an entry point for any known exploits, as well as for trying common admin username and password combinations.
Requesting paths associated with WordPress, even once, now gets you a temporary ban. It's simply not something any legitimate traffic will do on my site. I do the same for a few other particularly common HTTP attack vectors.
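As a rough illustration of that kind of check (my site runs PHP, so this is a language-neutral sketch in Python rather than my actual code), the test can be as simple as looking for a well-known WordPress name in any segment of the requested path. The function name and the exact list of markers are assumptions for the example; only /xmlrpc.php comes from my logs above.

```python
# Path segments that only make sense on a WordPress install.
# The list is illustrative, not exhaustive.
WORDPRESS_MARKERS = frozenset({
    "wp-admin",
    "wp-login.php",
    "wp-content",
    "wp-includes",
    "xmlrpc.php",
})


def is_wordpress_probe(request_target):
    """Return True if the requested path contains a WordPress-specific segment."""
    # Drop the query string and fragment, then compare case-insensitively.
    path = request_target.split("?", 1)[0].split("#", 1)[0].lower()
    # Checking every segment also catches probes like /blog/wp-login.php.
    return any(segment in WORDPRESS_MARKERS for segment in path.split("/"))
```

Matching whole segments rather than substrings means a legitimate post with a slug like /posts/wp-login-explained would not trip the ban.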
There are other bots which try to probe different platforms and vulnerabilities and I can't guard against everything - it's a public server, I can't really stop people requesting invalid paths.
What I can do for these other cases, though, is throttle the requests at the application level. I have a rate-limiter in place which does a few things to mitigate bot activity and network abuse. For one, if an IP makes too many requests resulting in a 404 in a short period of time, it is temporarily blacklisted and starts getting a 429 very early in the request process, before any of the more computationally heavy work takes place.
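To make the idea concrete, here is a minimal in-memory sketch of a 404 throttle in Python. It is not my actual implementation (the site itself is PHP), and the class name, thresholds and sliding-window approach are all assumptions chosen for the example. A real deployment would more likely keep this state in a shared store so that it survives restarts and works across processes.

```python
import time
from collections import defaultdict, deque


class NotFoundThrottle:
    """Temporarily ban IPs that trigger too many 404s within a sliding window."""

    def __init__(self, max_404s=10, window_seconds=60.0,
                 ban_seconds=600.0, clock=time.monotonic):
        self.max_404s = max_404s              # 404s tolerated per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.ban_seconds = ban_seconds        # how long a ban lasts
        self.clock = clock                    # injectable for testing
        self._hits = defaultdict(deque)       # ip -> timestamps of recent 404s
        self._banned_until = {}               # ip -> time the ban expires

    def is_banned(self, ip):
        """Cheap check to run first; a True result means 'respond with 429'."""
        expires = self._banned_until.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self._banned_until[ip]        # ban has run its course
            return False
        return True

    def record_404(self, ip):
        """Call after serving a 404; bans the IP once it exceeds the limit."""
        now = self.clock()
        hits = self._hits[ip]
        hits.append(now)
        # Forget 404s that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_404s:
            self._banned_until[ip] = now + self.ban_seconds
            del self._hits[ip]
```

The request handler would call is_banned() before doing anything expensive and return a 429 if it is true, then call record_404() whenever the application ends up serving a 404.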
I also mandate HTTPS for my site, via redirection from HTTP and Strict-Transport-Security headers. This is mostly for my benefit when logging in to my own site (though I do have a comments form), but it is also a best-practice on any site, regardless of whether you collect sensitive information from users or not.
And the mail server
No part of an exposed system goes unprobed and unsullied by the hacker bots. There are many attempts in the logs to access my IMAP server with non-existent email addresses (again, common things like admin@, customer@ etc.).
I don't blacklist these IPs unless they're particularly persistent and bothersome (again via automated tools which monitor the traffic and respond at defined thresholds), but I do have a 5-second delay after each authentication failure, so at least these attempts are forced to be spread apart.