Open Proxy RBL Lookups in PHP
Thursday, May 26th, 2005
If you’re developing a web application that could be susceptible to what’s called a “hitbot”, a script or program that repeatedly automates a task such as voting on a poll, posting comments to a blog, brute-forcing a password-protected site, or clicking ad banners, you want to implement some reasonable protection against these attacks. The simplest form of attack is a program such as ClickBot that just repeatedly makes a GET or POST request to a server. To defeat this you just track $_SERVER['REMOTE_ADDR'] in a log or database and prevent multiple hits from the same IP address. The next type of attack uses a slightly more advanced piece of software such as Smart Hitbot that takes a list of proxy servers and connects to the target through them.
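The per-IP check described above might look something like the following. This is only a sketch: is_repeat_hit() and the 'poll-42' action name are made up for illustration, and the $seen array is an in-memory stand-in for the log or database, so in a real application it would have to be backed by storage that persists between requests.

```php
<?php
// Hypothetical helper: records a hit and reports whether this IP address
// has already performed the given action. $seen stands in for a real
// log or database table (an assumption, not a persistent store).
function is_repeat_hit(array &$seen, $ip, $action) {
    $key = $action . '|' . $ip;
    if (isset($seen[$key])) {
        return true;            // same IP, same action: treat as a repeat
    }
    $seen[$key] = time();       // remember when we first saw this IP
    return false;
}

$seen = array();
// REMOTE_ADDR is absent when run from the command line, so fall back
$ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '0.0.0.0';
if (is_repeat_hit($seen, $ip, 'poll-42')) {
    // silently ignore the vote
}
```

Keying on the action as well as the address lets one visitor vote on two different polls without being flagged.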
Smart Hitbot can hit one page first, picking up the proper referrer and/or cookies before hitting the second, so checking the referrer or requiring a cookie won’t help. What’s important to understand about proxy servers is that not all proxies are equal. Some will pass HTTP_PROXY_CONNECTION = keep-alive or something along those lines, but merely detecting that it’s a proxy won’t do you any good, as lots of people are connected to the net through caching proxies or censoring firewalls. But many (so-called non-anonymous) proxies pass HTTP_FORWARDED, HTTP_X_FORWARDED_FOR, HTTP_VIA, HTTP_XROXY_CONNECTION, HTTP_PROXY_CONNECTION, or HTTP_CLIENT_IP, some of which give away the original IP address. If one of these is detected you need to log BOTH the source IP and the proxy IP address. I’ve seen some scripts that detect these fields and then log only the original IP address, leaving them open to an attack where a machine pretends to be an open proxy and hits a site directly, forging the HTTP_X_FORWARDED_FOR field with random IP addresses. You need to make reasonable accommodations though; if two people connect to your site through the same proxy, it’s not wise to assume they’re cheating the system and fire off warning alarms and autobans.
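One way to log both addresses is to collect the proxy fields next to REMOTE_ADDR instead of replacing it. The sketch below assumes a hypothetical helper named collect_addresses(); it takes the $_SERVER array as a parameter so it can be tried with sample data, and it treats every forwarded value as an unverified claim.

```php
<?php
// Hypothetical helper: returns the connecting address together with any
// proxy-supplied fields. The claimed values come from request headers and
// can be forged, so they are recorded alongside REMOTE_ADDR, never instead of it.
function collect_addresses(array $server) {
    $fields = array('HTTP_FORWARDED', 'HTTP_X_FORWARDED_FOR', 'HTTP_VIA',
                    'HTTP_XROXY_CONNECTION', 'HTTP_PROXY_CONNECTION',
                    'HTTP_CLIENT_IP');
    $claimed = array();
    foreach ($fields as $field) {
        if (!empty($server[$field])) {
            $claimed[$field] = $server[$field];
        }
    }
    return array(
        'remote'  => isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : null,
        'claimed' => $claimed,
    );
}

$info = collect_addresses($_SERVER);
error_log('hit from ' . $info['remote'] . ' claimed=' . json_encode($info['claimed']));
```

Because the pair is stored together, a forged HTTP_X_FORWARDED_FOR value changes only the claimed half of the record; the connecting address still ties the hits to one machine.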
The other type of proxy is the most difficult to detect; it won’t pass any client IP address or give any clues that it’s making the request for someone else, so it looks exactly like a normal client. Attackers will compile huge databases of open proxies by querying popular websites or using bots that search Google for lists, then meticulously prune them to find fast, completely anonymous proxies using programs such as Charon. Fortunately you have a defense against these “super-proxies”, known as an RBL, or Realtime Blackhole List. SORBS maintains an RBL for open HTTP proxies, and it’s trivial in PHP to check a connecting IP address against the blacklist. For example:
/* function exists_in_rbl()
 * Checks to see if the client is listed in any proxy blacklists
 * Returns true if the host is blacklisted, false if not
 */
function exists_in_rbl() {
    $rbls = array('http.dnsbl.sorbs.net', 'misc.dnsbl.sorbs.net');
    $remote = $_SERVER['REMOTE_ADDR'];
    // Anchored so that only a complete dotted-quad IPv4 address matches
    if (preg_match('/^([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)$/',
                   $remote, $matches)) {
        foreach ($rbls as $rbl) {
            // The lookup name is the octets in reverse order plus the list's zone
            $rblhost = $matches[4] . '.' . $matches[3] . '.' .
                       $matches[2] . '.' . $matches[1] . '.' . $rbl;
            // gethostbyname() returns its argument unchanged when the lookup fails
            $resolved = gethostbyname($rblhost);
            if ($resolved != $rblhost) {
                return true;
            }
        }
    }
    return false;
}
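The name the function looks up follows the usual DNSBL convention: reverse the four octets of the address and append the list’s zone. A minimal sketch that isolates just that step is shown here; rbl_query_host() is a hypothetical name, not part of the function above, and it validates the address with filter_var() instead of a regular expression.

```php
<?php
// Hypothetical helper: builds the hostname to look up for one IPv4 address
// in one DNSBL zone. Returns null for anything that is not a valid IPv4
// address (IPv6 is not handled in this sketch).
function rbl_query_host($ip, $zone) {
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null;
    }
    // 192.0.2.99 becomes 99.2.0.192, then the zone is appended
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

$host = rbl_query_host('192.0.2.99', 'http.dnsbl.sorbs.net');
// $host is now '99.2.0.192.http.dnsbl.sorbs.net'
```

Keeping the string building separate from the DNS call makes it possible to check the reversal logic without touching the network.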
A word of warning with RBLs: some lists (SORBS is notorious for this) are very trigger-happy when it comes to adding addresses, or even entire subnets they believe are zombie networks, to the blacklist. If you get a single connection from a blacklisted IP, don’t put your site into DEFCON 1 and launch ICBMs at the client. A simple warning in the log will do, and if you start getting tens/hundreds/thousands of blacklist positives in a small timeframe THEN you can set off the sirens. Note that during an attack like this all you can do is silently ignore blacklisted clients or send them a nasty warning. If your site automatically goes into lockdown of some form, you open yourself to a denial of service attack.
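The “log one, alarm on many” policy could be sketched as a sliding-window counter. Everything here is an assumption for illustration: the helper name rbl_alarm(), the 300 second window, and the threshold of 100 are arbitrary, and the $positives array stands in for storage that would need to be shared between requests.

```php
<?php
// Hypothetical helper: records the time of a blacklist positive and reports
// whether the number of positives inside the window has reached the threshold.
// The default window and threshold are arbitrary example values.
function rbl_alarm(array &$positives, $now, $window = 300, $threshold = 100) {
    $positives[] = $now;
    $cutoff = $now - $window;
    // Drop timestamps that have aged out of the window
    $positives = array_values(array_filter($positives, function ($t) use ($cutoff) {
        return $t > $cutoff;
    }));
    return count($positives) >= $threshold;
}

$positives = array();   // stand-in for a shared store
if (rbl_alarm($positives, time())) {
    error_log('RBL: unusual number of blacklisted clients, possible attack');
} else {
    error_log('RBL: blacklisted client seen');   // a simple warning is enough
}
```

Note that the alarm only raises the log level; it does not lock the site, which keeps the check itself from becoming a denial of service lever.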



