Full Version: Referrer Spam?
KarenHuang
Is there such a thing as referrer spam? Apparently some porn sites are linking to my blog... Paris Hilton in particular seems to be a big fan!

Not that I have anything against pornographers, but I don't see my link anywhere on the site. That is, if I live through the multiple pop-up windows; usually I don't.
LisaJill
Yes, it's a huge problem for websites.

The best you can do is find their IP and ban them using .htaccess.

Good luck. =)
KarenHuang
Oh, I never knew that. Thanks for replying so soon.

I will try it out.

Another question... do you know how it works?
maddy
This Google search: "referer spam" OR "referrer spam" should give you more information on how it works, and how you can try to stop it. :)
LisaJill
Wired Explanation

One way to fight it
Tom Alday
I have this same problem (damn you, Paris Hilton!!) and was wondering how to end it. I came across this guy who offers up an .htaccess solution:

CODE
# Options +FollowSymlinks
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite1\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite2\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite3\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite4\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite5\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite6\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?spammersite7\.com.*$ [NC]
RewriteRule .* http://www.some-other-website.com [R,L]


The only problem is I have no idea where to put this in my .htaccess file. Can anyone with more .htaccess experience than me help out?
arvind
You can put it anywhere!
Tom Alday
just a follow up for anyone with this same problem

this code:

CODE
SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*" BadReferrer
order deny,allow
deny from env=BadReferrer


in my .htaccess file has hopefully solved this issue. I haven't gotten a Paris Hilton/porn referer in a couple of hours!
KarenHuang
Tom, thanks for making it so easy for us.

I just tried it. Hope it works.
Tom Alday
Two weeks later it's still working great. Whenever a new offensive porn referer shows up, I just add it to the .htaccess and re-upload it. Works perfectly.
KarenHuang
Keeping my fingers crossed. There's no way of checking if it's working, is there?

I'm a bit impatient. :P
KarenHuang
Um, could you explain your code a little?

CODE
SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*" BadReferrer
order deny,allow
deny from env=BadReferrer


Does this mean that any referrer containing the words listed would be banned? Whether or not the domain name itself contains those words?

Does it mean that if I do a search for paris hilton on google, and click on my own site, the referrer (eg: google.com.sg/search?q=paris+hilton+snog&hl=en) will not show?

I'm trying to check if the code is working, since this is the first time I'm working with .htaccess.
imabug
QUOTE
Two weeks later it's still working great. Whenever a new offensive porn referer shows up, I just add it to the .htaccess and re-upload it. Works perfectly.

Keep in mind that Apache reads the .htaccess file every time a page in that directory is accessed. The larger and more complex your htaccess becomes, the longer it will take for Apache to process the file, which could eventually result in slow response times.
QUOTE
There's no way of checking if it's working, is there?

Any rejections will probably be written to the server error log. If you have access to it, you can check in there to see what IP addresses were rejected.
Tom Alday
QUOTE (KarenHuang @ Jan 28 2004, 05:35 PM)
Um, could you explain your code a little?

CODE
SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*" BadReferrer
order deny,allow
deny from env=BadReferrer


Does this mean that any referrer containing the words listed would be banned? Whether or not the domain name itself contains those words?

Does it mean that if I do a search for paris hilton on google, and click on my own site, the referrer (eg: google.com.sg/search?q=paris+hilton+snog&hl=en) will not show?

I'm trying to check if the code is working, since this is the first time I'm working with .htaccess.

OK, you have the word "porn" in the .htaccess. Now, if there was a link to your site from www.porn.edu and someone clicked it, they would get a "403 Forbidden" error page. The server checks the referer that is being sent (porn.edu in this case) against the words in your .htaccess file, and if one of them matches, the request is blocked.

I suppose you could test it by temporarily adding a word from a domain you know links to you, and then clicking their link to your site.
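The matching described above can be tried out offline. This is only a rough Python sketch, not Apache itself: it assumes the pattern is searched for anywhere in the Referer value (domain, path, or query string), with re.IGNORECASE standing in for the "NoCase" part of the directive. The sample referers are made up for illustration, apart from the Google search URL asked about earlier in the thread.

```python
import re

# Same alternation as the SetEnvIfNoCase line in this thread; re.IGNORECASE
# stands in for the "NoCase" part of the directive.
BAD_REFERRER = re.compile(
    r".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*",
    re.IGNORECASE,
)


def is_bad_referrer(referrer: str) -> bool:
    """Return True if any listed word appears anywhere in the Referer value."""
    return BAD_REFERRER.search(referrer) is not None


# Hypothetical Referer values for illustration.
samples = [
    "http://www.porn.edu/links.html",
    "http://www.google.com.sg/search?q=paris+hilton+snog&hl=en",
    "http://www.typophile.com/forums/",
    "http://www.essex.ac.uk/",
]
for ref in samples:
    print(ref, "->", "blocked" if is_bad_referrer(ref) else "allowed")
```

Under these assumptions the Google search for "paris hilton" is caught as well, because the words are in the query string, and so is an innocent address like essex.ac.uk, because it contains "sex". Short words in the list are the ones most likely to block legitimate visitors.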
KarenHuang
It's not working. Paris was back this morning. And I tested with the word typophile. But it still turns up in my referrers.

Is there something I'm not doing right?
Sonance
QUOTE
It's not working. Paris was back this morning.

The one mistake a lot of people make is putting the IP address of the spam referral site in their .htaccess denials, rather than the IP address which planted it.

For example, denying the IP address of the Paris Hilton Sex Video (which has appeared in my referral logs too) won't have much effect, because that's not the IP address that's planting the spam there.

However, spotting the true IP address responsible for the spamming is quite easy. There are a few ways to do it.

1) If your web host/server uses Awstats to generate your stats, then consult your latest log file. Look at the list of "hosts", ie the IP addresses that have actually visited or pinged your web site. What you need to look out for is any visiting IP address that consumes 0 bytes of bandwidth. This just means they're pinging your HTTP header to generate the referral spam, rather than actually visiting a page.

2) Download your raw stats file and load it up in a decent text editor (preferably one that can display a full navigable list of search results, such as UltraEdit). Do a search for the string "HEAD (including the opening quotation mark, which is how the request appears in the log). This will show you a list of IPs that have (almost) exclusively just pinged your HTTP header, and it will show you the spoof referral they're generating. Be careful here, because a few legitimate sites (such as server statistic gatherers or indexers) can show up here. But more often than not it's a referral spammer.

3) Not all spam referrals exclusively punch the HEAD though. Some will actually GET your index page too. These can be a bit trickier to spot.

By way of an example, these are the search results for the "HEAD string within my blog site's latest referral log:

CODE
206.129.0.135 - - [14/Jan/2004:00:40:31 +0000] "HEAD / HTTP/1.0" 200 0 "http://www.riaa.com" "Referrer Advertising System"

I seem to get hit by referral spam for the RIAA whenever I post a new blog entry that mentions mp3s, digital downloading, or anything like that. 206.129.0.135 (NOT www.riaa.com) gets added to my denial list.

CODE
172.132.46.68 - - [15/Jan/2004:07:50:10 +0000] "HEAD / HTTP/1.1" 200 0 "http://www.starprose.com" "StarProse Referrer Advertising System 2004"

See -- it's so much easier when these referral spammers identify themselves so easily. 172.132.46.68 gets banned.

CODE
172.167.114.207 - - [26/Jan/2004:16:52:44 +0000] "HEAD / HTTP/1.1" 200 0 "http://www.joe2004.com" "StarProse Referrer Advertising System 2004"

172.167.114.207 - - [26/Jan/2004:21:17:20 +0000] "HEAD / HTTP/1.1" 200 0 "http://www.clark04.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Crazy Browser 1.0.5; Alexa Toolbar)"

WTF?! Presidential candidate spam. Oddly enough, the same IP address is responsible for referral spamming the official web sites of both Democratic candidates.
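For anyone who would rather script the "HEAD search than do it in a text editor, here is a minimal Python sketch. It assumes the log is in the combined format shown in the lines above; the function name and the third sample line (an ordinary GET) are made up for illustration. As noted in step 2, legitimate indexers also send HEAD requests, so treat the output as a list of suspects, not a ban list.

```python
import re

# Combined Log Format, as in the sample lines quoted in this thread.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def head_requests_with_referer(lines):
    """Yield (ip, referer, agent) for HEAD requests that carry a Referer."""
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m["method"] == "HEAD" and m["referer"] not in ("", "-"):
            yield m["ip"], m["referer"], m["agent"]


sample_log = [
    '206.129.0.135 - - [14/Jan/2004:00:40:31 +0000] "HEAD / HTTP/1.0" 200 0 "http://www.riaa.com" "Referrer Advertising System"',
    '172.132.46.68 - - [15/Jan/2004:07:50:10 +0000] "HEAD / HTTP/1.1" 200 0 "http://www.starprose.com" "StarProse Referrer Advertising System 2004"',
    # Hypothetical ordinary visit, added for contrast.
    '192.0.2.10 - - [15/Jan/2004:08:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
]

# Unique client IPs that sent a HEAD request with a Referer attached.
suspects = sorted({ip for ip, _, _ in head_requests_with_referer(sample_log)})
print(suspects)
```

This only covers the HEAD case from step 2. Spammers that GET the index page, as in step 3, would need a different check.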

I've only started doing this within the last month, seeing as I've only just recently launched a new blog, but these are the IP addresses I'm currently denying:

206.129.0.135
172.132.46.68
172.137.45.63
206.129.0.134
206.129.0.132
206.129.0.133
172.167.114.207
205.236.189.35
69.7.170.46
213.123.247.181
217.118.39.51
200.41.4.3
200.217.186.2
63.110.140.28
195.241.96.171
64.173.247.185
200.161.74.26
194.249.174.250
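A list like this has to end up as "deny from" lines in .htaccess. A small Python sketch of that step, assuming one IPv4 address per entry; the helper name deny_lines is hypothetical. It drops duplicates and raises ValueError on anything that isn't a valid IP address, which guards against typos.

```python
import ipaddress


def deny_lines(addresses):
    """Return sorted, de-duplicated 'deny from' lines for valid IP addresses."""
    unique = {ipaddress.ip_address(a.strip()) for a in addresses if a.strip()}
    return [f"deny from {ip}" for ip in sorted(unique)]


# A few addresses from the list above, with one repeated on purpose.
banned = ["206.129.0.135", "172.132.46.68", "206.129.0.135"]
print("\n".join(deny_lines(banned)))
```

The printed lines can be pasted into .htaccess as they are.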

Another tip -- you might want to create a "friendly" 403 error page. This is what people are going to be seeing if they're in my deny list. Given that this blocking procedure is manual, there's always the risk a legitimate IP will end up in it, so you might want to use the 403 error page to briefly explain why they might have been denied and who to contact to resolve the issue.

Oh, and if your web site doesn't actually display a list of referrers, creating denial lists probably isn't worth bothering with. It's only bloating your .htaccess file, which EVERY incoming visitor is going to hit -- and 99.99% of those visits are from genuine visitors. Of course, if you're being hit dozens/hundreds/thousands of times a day by the same spam referrals, then you should add it. But if it's just a "casual" referral spammer who's visiting once every few days, it's probably not worth the effort.
patch
When I add,

CODE
SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*" BadReferrer
order deny,allow
deny from env=BadReferrer


I get a 500 Internal Server Error when trying to access mt.cgi.

My complete .htaccess file looks like this:

CODE
<Files 403.shtml>
order allow,deny
allow from all
</Files>

deny from 68.163.90.10
deny from 63.247.85.11
deny from 63.247.85.10
deny from 146.82.174.13
deny from 146.82.174.10
deny from 69.61.11.163
deny from 216.150.91.66

SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|paris|gabriola|nude|xxx|hilton|pics|video).*"
BadReferrer
order deny,allow
deny from env=BadReferrer


What am I doing wrong?

Regardless, Sonance's post is very enlightening and given the method of referral spam, I question the utility of the filtering of domains. Perhaps an '.htaccess referral spam clearinghouse' like MT-Blacklist would be considered by someone?
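One possible cause of the 500 error, assuming the line break before BadReferrer in the listing above is really in the file and not just forum wrapping: SetEnvIfNoCase needs the header name, the pattern, and at least one variable name all on the same line, and a bare "BadReferrer" on its own line is not a directive Apache recognises. An unrecognised directive in .htaccess typically produces a 500. The Python sketch below is a hypothetical checker (the function name is made up) that flags SetEnvIf lines with too few arguments; the shortened patterns are just for illustration.

```python
import shlex

SETENVIF = {"setenvif", "setenvifnocase"}


def short_setenvif_lines(htaccess_text):
    """Return (line_number, line) for SetEnvIf lines with fewer than 3 arguments."""
    problems = []
    for number, raw in enumerate(htaccess_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # shlex keeps the quoted pattern together as one token.
        tokens = shlex.split(line)
        if tokens[0].lower() in SETENVIF and len(tokens) < 4:
            problems.append((number, line))
    return problems


broken = '''SetEnvIfNoCase Referer ".*(casino|porn).*"
BadReferrer
order deny,allow
deny from env=BadReferrer
'''
fixed = '''SetEnvIfNoCase Referer ".*(casino|porn).*" BadReferrer
order deny,allow
deny from env=BadReferrer
'''
print(short_setenvif_lines(broken))
print(short_setenvif_lines(fixed))
```

If the check flags the SetEnvIfNoCase line, joining it with the following line so that BadReferrer sits at the end is the first thing to try. The server error log should name the offending line either way.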
KarenHuang
I just realised this!

If you're using textism's refer, it already comes with a built-in anti-spam feature. It's in refer.php. Just add on the words you want to block, like so:

CODE
/*    Fill these in to halt the recording of unwanted referrals
      (e.g., an overly frequent google search, or a robot that
      inserts a referrer for every page it visits) by matching a
      distinct phrase. To add more, just duplicate a line and put
      a different match phrase inside 'quotes'.                    */

    $rcfg['exclude'][] = 'viagra';
    $rcfg['exclude'][] = 'nude';
    $rcfg['exclude'][] = 'sex';
    $rcfg['exclude'][] = 'porn';
    $rcfg['exclude'][] = 'janet-jackson';
    $rcfg['exclude'][] = 'paris-hilton';
    $rcfg['exclude'][] = 'xxxx';


This method does not require any modifications to htaccess.
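To get a feel for what phrase matching like this does, here is a rough Python sketch. It is not refer.php's actual code: it assumes a plain, case-insensitive substring test against the referrer, which may differ from what the script really does. The function name and the sample URLs are made up.

```python
def is_excluded(referrer, phrases):
    """Return True if any exclusion phrase occurs in the referrer (assumed case-insensitive)."""
    lowered = referrer.lower()
    return any(phrase.lower() in lowered for phrase in phrases)


# The phrases from the refer.php snippet above.
exclude = ["viagra", "nude", "sex", "porn", "janet-jackson", "paris-hilton", "xxxx"]

# Hypothetical referrers for illustration.
print(is_excluded("http://example.com/paris-hilton-video", exclude))
print(is_excluded("http://www.google.com/search?q=paris+hilton", exclude))
```

Under that assumption a hyphenated phrase such as 'paris-hilton' would not catch a search URL that spells it "paris+hilton", so it may be worth adding the single words too. Note that this only stops the referral from being recorded; the request itself is still served.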