Like Flies to Project Honeypot: Revisiting the CGI proxy hijack problem

CGI proxy hijacking appears to be getting worse. I am pretty sure that Google is well aware of it by now, but it seems they have other things higher on their priority list. If you are not familiar with the problem, take a look at these for some background information:

  1. Dan Thies’s take and proposed solutions

  2. My take and proposed solutions

Basically, negative SEOs are causing good pages to drop from the search engine results by pointing CGI proxy servers’ URLs at a victim’s domain and then linking to those URLs so that search engine bots find them. The duplicate content filters then drop one of the pages: inevitably the one with the lower PageRank, the victim’s page.

As I mentioned in a previous post, this is very likely to be an ongoing battle, but that doesn’t mean we have to lie down and do nothing. Existing solutions require the injection of a meta robots noindex tag on all web pages if the visitor is not a search engine. In this way search engines won’t index the proxy-hijacked page. Unfortunately, the proxies are already altering the content before passing it to the search engine, so they can strip the tag. I am going to present a solution I think can drastically reduce the effectiveness of such attacks.
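The existing noindex approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from any of the linked posts: the function name `inject_noindex` and the `is_verified_search_engine` flag are assumptions made for the example, and how a site decides that a visitor really is a search engine is left outside the sketch.

```python
import re

# Tag served to every visitor that is not a verified search engine bot.
NOINDEX_TAG = '<meta name="robots" content="noindex, nofollow">'


def inject_noindex(html: str, is_verified_search_engine: bool) -> str:
    """Return the page unchanged for verified bots, otherwise add a noindex tag.

    `is_verified_search_engine` is a hypothetical flag computed elsewhere;
    this sketch does not show how the visitor is verified.
    """
    if is_verified_search_engine:
        return html
    # Insert the tag right after the opening <head ...> tag, if present.
    match = re.search(r"<head\b[^>]*>", html, flags=re.IGNORECASE)
    if match is None:
        return html  # no <head> found; this sketch leaves the page untouched
    return html[:match.end()] + NOINDEX_TAG + html[match.end():]
```

A request arriving through a CGI proxy is not a verified bot, so the proxied copy carries the noindex tag, while a bot that fetches the page directly sees the original. The weakness is the one noted above: a proxy that rewrites the HTML can remove the tag before the search engine sees it.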

A Never-ending Battle — Protecting your content from CGI hijackers

In computer security we have several ongoing battles: the virus/spyware writers vs. the antivirus vendors, the spammers vs. the anti-spam vendors, the hackers vs. the security experts. Add to that list the search engine marketers vs. the CGI hijackers.

Dan Thies, the undisputed keyword research master, used his influence in the search engine marketing industry to bring the problem we have blogged about in the past to a wider audience. Specifically, the issue is CGI proxy hijacking. He mentioned a couple of solutions, but as I pointed out in my comment, both have weaknesses. I recommended a stronger countermeasure, similar to what the anti-spam industry uses at the moment. But after reflecting on my proposed solutions and others’, it is clear to me that this is a never-ending battle. We can create defenses against current techniques, and attackers will adapt and make their attacks smarter.

The Never Ending SERPs Hijacking Problem: Is there a definite solution?

In 2005 it was the infamous 302 (temporary redirect) page hijacking. That was supposedly fixed, according to Matt Cutts. Now there is an interesting new twist. Hijackers have found another exploitable hole in Google: the use of CGI proxies to hijack search engine rankings.

The problem is basically the same. Two URLs pointing to the same content. Google's duplicate content filters kick in and drop one of the URLs. They normally drop the page with the lower PageRank. That is Google's core problem. They need to find a better way to identify the original author of the page.
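To make the failure mode concrete, here is a toy model of the behavior described above. It is only an illustration of "two URLs, same content, keep the one with the higher PageRank"; it is not Google's actual algorithm, and the `Page` class, the `filter_duplicates` function, and the example URLs and scores are all made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    content: str
    pagerank: float  # illustrative score, not a real PageRank value


def filter_duplicates(pages):
    """Keep one URL per identical content: the one with the highest score."""
    best = {}
    for page in pages:
        kept = best.get(page.content)
        if kept is None or page.pagerank > kept.pagerank:
            best[page.content] = page
    return list(best.values())


# Hypothetical example: the proxy URL serves the victim's content
# and happens to have the higher score.
victim = Page("http://victim.example/page", "original article", 2.0)
proxy = Page("http://proxy.example/fetch/victim.example/page", "original article", 5.0)
survivors = filter_duplicates([victim, proxy])
```

In this toy model the only URL left in `survivors` is the proxy's, which is exactly the outcome the hijacker wants. Nothing in the filter asks which URL published the content first.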

When someone blatantly copies your content and hosts it on their site, you can get the offending page taken down by sending a DMCA complaint to Google, et al. The problem with 302 redirects and CGI proxies is that no content is being copied. They simply trick the search engine into believing there are multiple URLs hosting the same content.

What is a CGI proxy anyway? Glad you asked. I love explaining technical things :-)