{"id":1821,"date":"2012-11-21T22:24:56","date_gmt":"2012-11-21T20:24:56","guid":{"rendered":"http:\/\/planetozh.com\/blog\/?p=1821"},"modified":"2014-03-04T18:19:22","modified_gmt":"2014-03-04T17:19:22","slug":"checking-domain-blacklists-from-spamhaus-surbl-and-uribl-in-php","status":"publish","type":"post","link":"https:\/\/planetozh.com\/blog\/2012\/11\/checking-domain-blacklists-from-spamhaus-surbl-and-uribl-in-php\/","title":{"rendered":"Checking Domain Blacklists from Spamhaus, SURBL and URIBL in PHP"},"content":{"rendered":"<p>I&#39;ve been speaking lately with folks from <a href=\"http:\/\/www.spamhaus.org\/\">Spamhaus<\/a> about anti spam measure in <a href=\"http:\/\/yourls.org\/\">YOURLS<\/a> and a YOURLS plugin for this. Currently the #1 result in Google for &quot;<tt>spamhaus PHP<\/tt>&quot; is a post on Lockergnome which gets it totally wrong and provides a script that does not work, so here is a PHP script that <strong>does work<\/strong>.<\/p>\n<p>This script checks a URL (its domain part, in fact) against the 3 major black lists: <a href=\"http:\/\/www.spamhaus.org\/\">Spamhaus<\/a>, <a href=\"http:\/\/www.surbl.org\/\">SURBL<\/a> and <a href=\"http:\/\/uribl.com\/\">URIBL<\/a>.<\/p>\n<p><strong>The script:<\/strong><\/p>\n<div id=\"ig-sh-1\" class=\"syntax_hilite\">\n\n\t\t<div class=\"toolbar\">\n\n\t\t<div class=\"view-different-container\">\n\t\t\t\t\t\t<a href=\"#\" class=\"view-different\">&lt; View <span>plain text<\/span> &gt;<\/a>\n\t\t\t\t\t<\/div>\n\n\t\t<div class=\"language-name\">php<\/div>\n\n\t\t\n\t\t<br clear=\"both\">\n\n\t<\/div>\n\t\n\t<div class=\"code\">\n\t\t<ol class=\"php\" style=\"font-family:monospace\"><li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">\/**<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;* Check a URL against the 3 major blacklists<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;*<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;* @param string $url The URL to check<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;* @return mixed true if blacklisted, false if not blacklisted, 'malformed' if URL looks weird<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;*\/<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">function ozh_is_blacklisted( $url ) {<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; $parsed = parse_url( $url );<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; if( !isset( $parsed['host'] ) )<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; return 'malformed';<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; \/\/ Remove www. from domain (but not from www.com)<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; $parsed['host'] = preg_replace( '\/^www\\.(.+\\.)\/i', '$1', $parsed['host'] );<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; \/\/ The 3 major blacklists<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; $blacklists = array(<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; 'zen.spamhaus.org',<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; 'multi.surbl.org',<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; 'black.uribl.com',<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; );<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; \/\/ Check against each black list, exit if blacklisted<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; foreach( $blacklists as $blacklist ) {<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; $domain = $parsed['host'] . '.' . $blacklist . '.';<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; $record = dns_get_record( $domain );<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; if( count( $record ) &gt; 0 )<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return true;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; }<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; \/\/ All clear, probably not spam<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; return false;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">}<\/div><\/li>\n<\/ol>\t<\/div>\n\n<\/div>\n\n<p>Usage:<\/p>\n<div id=\"ig-sh-2\" class=\"syntax_hilite\">\n\n\t\t<div class=\"toolbar\">\n\n\t\t<div class=\"view-different-container\">\n\t\t\t\t\t\t<a href=\"#\" class=\"view-different\">&lt; View <span>plain text<\/span> &gt;<\/a>\n\t\t\t\t\t<\/div>\n\n\t\t<div class=\"language-name\">php<\/div>\n\n\t\t\n\t\t<br clear=\"both\">\n\n\t<\/div>\n\t\n\t<div class=\"code\">\n\t\t<ol class=\"php\" style=\"font-family:monospace\"><li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">if( ozh_is_blacklisted( $url ) ) {<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; \/\/ do something brutal (eg die() your script, yell at user, etc...)<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">}<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">\/\/ all is fine *for today*, do your regular stuff.<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">\/\/ This said, it'd be nice to recheck every couple of days<\/div><\/li>\n<\/ol>\t<\/div>\n\n<\/div>\n\n<p>Feel free to steal.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#39;ve been speaking lately with folks from Spamhaus about anti spam measure in YOURLS and a YOURLS plugin for this. Currently the #1 result in Google for &quot;spamhaus PHP&quot; is a post on Lockergnome which gets it totally wrong and provides a script that does not work, so here is a PHP script that does work. This script checks a\u2026<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[398,2,10,27],"class_list":["post-1821","post","type-post","status-publish","format-standard","hentry","category-published","tag-blacklist","tag-code","tag-php","tag-spam"],"_links":{"self":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/1821","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/comments?post=1821"}],"version-history":[{"count":0,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/1821\/revisions"}],"wp:attachment":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/media?parent=1821"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/categories?post=1821"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/tags?post=1821"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}