As is well known by now, Google decided to remove the supplemental label from pages it adds to its supplemental index. That is unfortunate because pages that are labeled this way need some “link lovin’.” How are we going to give those pages the love they need if we are not able to identify them in the first place?
In this post, I want to take a look at some of the alternatives we have to identify supplemental pages. WebmasterWorld has exposed a new query that displays such results, but nobody knows how long it is going to last. Type this into your Google search box: site:hamletbatista.com/& and you’ll see my supplemental pages. I tested it before and after Google removed the label and I'm getting the same pages.
Danny Sullivan talks about a technique Google suggests we use to get the supplemental pages:
First, get a list of all of your pages. Next, go to the webmaster console [Google Webmaster Central] and export a list of all of your links. Make sure that you get both external and internal links, and concatenate the files.
Now, compare your list of all your pages with your list of internal+external backlinks. If you know a page exists, but you don't see that page in the list of site with backlinks, that deserves investigation. Pages with very few backlinks (either from other sites or internally) are also worth checking out.
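The comparison Danny describes can be scripted once both lists are on disk. What follows is only a minimal sketch in Python, not anything Google or Danny provides: it assumes you have already reduced the exported link files to a plain list of link targets (one entry per link), and the function names and the `max_links` threshold are hypothetical choices of mine.

```python
from collections import Counter


def normalize(url):
    """Trim whitespace and a trailing slash so trivially different URLs match."""
    return url.strip().rstrip("/")


def find_weakly_linked(all_pages, link_targets, max_links=1):
    """Return (page, backlink_count) pairs for pages with at most max_links links.

    all_pages    -- every URL you know exists on the site
    link_targets -- the target URL of each exported link (internal and
                    external files concatenated), one entry per link
    """
    # Count how many exported links point at each page.
    counts = Counter(normalize(t) for t in link_targets if t.strip())

    report = []
    for page in all_pages:
        key = normalize(page)
        if not key:
            continue  # skip blank lines
        n = counts.get(key, 0)
        if n <= max_links:
            report.append((page.strip(), n))

    # Pages with no backlinks at all come first.
    return sorted(report, key=lambda item: item[1])


if __name__ == "__main__":
    pages = ["http://example.com/a", "http://example.com/b/", "http://example.com/c"]
    links = ["http://example.com/a", "http://example.com/a", "http://example.com/b"]
    for page, n in find_weakly_linked(pages, links):
        print(n, page)
```

Pages reported with zero links are the ones that "deserve investigation"; the ones with a single link are the "very few backlinks" case.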
Not sure about you, but that sounds like quite a bit of work. I agree with Danny that the best solution would be an option in Google Webmaster Central to download the supplemental pages.
Jim Boykin recommends we look for phrases that should only be in the target page; if the page does not come up for the phrase, then it has to be supplemental. Why? Because supplemental pages are partially indexed. That is a clever solution, but for a lazy SEO like me, or if you have a lot of pages on your site, that’s too much work. Imagine how many years it would take Michael Arrington to do this for TechCrunch!
While reading Jim Boykin's alternate solution, I remembered that I had promised one of my readers that I would create a tool to find such pages. To do so, I tried to solve this problem strategically. I followed the same steps I recommended in my previous post. I looked deeply at the problem, surveyed the current solutions and came up with a better one.
In the end, I revisited the script I created a while ago to identify link-rich and link-poor pages. After all, supplemental pages are pages that have very few links, so I only needed to make a small change to report pages with few links. At the moment my script only lists pages with a single inbound link. Also, as I decided to improve this blog's appeal to non-developers, I added a simple web interface.
Here is the link to the tool [temporarily disabled due to server processing issues]
You simply need to copy your server log file to your PC/Mac and upload it via the form. Depending on the size of the log file, it may take a while to copy to my server. Once uploaded, follow the link to the analysis step and you should have a list of pages that have only one inbound link. This list is currently larger than the one Google returns because I am not filtering internal pages or pages that are disallowed by robots.txt or meta robots tags. Maybe I can add that in the future.
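For readers who would rather run something locally, here is a rough idea of how such an analysis could work. This is not the tool's actual code, just a minimal sketch under my own assumptions: the log is in Apache "combined" format, each distinct referrer URL counts as one inbound link, and the regular expression and function name are hypothetical. Like the tool, it does not filter internal referrers or pages blocked by robots rules, and a page that was never requested with a referrer will not show up at all.

```python
import re
from collections import defaultdict

# Assumed Apache "combined" log layout:
# host ident user [time] "METHOD path PROTO" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def single_referrer_pages(log_lines):
    """Return the sorted paths that were reached from exactly one distinct referrer."""
    referrers = defaultdict(set)
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "200":
            continue  # ignore malformed lines and unsuccessful requests
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # direct visits carry no link information
        referrers[match.group("path")].add(referrer)
    return sorted(path for path, refs in referrers.items() if len(refs) == 1)


if __name__ == "__main__":
    import sys

    # Usage: python single_link_pages.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for path in single_referrer_pages(log_file):
            print(path)
```

Using a set per page means repeat visits through the same link count once, which is what we want when looking for pages with a single inbound link.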
I promise I will work to make the interface look nicer, too!