Posts

Link Mass: How to determine how much effort it takes to rank for any particular keyword phrase

Based on the emails and responses I received to my contribution to the “Link Building Secrets” project, I know that I am not the only one who loves to use metrics to measure how close I am to my goals. Thanks to everyone for your emails and encouraging comments. In this post I want to reveal another useful metric I use for our internal and client projects.

When you check the backlinks of sites ranking for competitive keywords (terms with many search results), you see that those sites have a large number of links pointing to them. But if you count the links of the top ten results (using Yahoo Site Explorer, as the rest of the backlink checkers are not very useful), you notice that the results at the top don’t necessarily have more links than the ones at the bottom. That is because each link carries its own rank-boosting weight (real PageRank and other link-value factors, in Google’s case) that contributes to the ranking of the page for that particular term. To simplify things, I like to refer to the combination of positive and negative link-value factors of a page as its Link Mass. Read more
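To make the idea of comparing aggregate link value (rather than raw link counts) concrete, here is a minimal sketch. The weights are made-up placeholders for illustration only, not Google's actual link-value factors, and the data structure is purely hypothetical.

```python
# Rough illustration: compare aggregate link value ("link mass") instead of raw link counts.
# The weights below are hypothetical placeholders, not Google's actual link-value factors.

def link_mass(links):
    """Sum the estimated value of each backlink pointing to a page.

    `links` is a list of dicts such as:
    {"pagerank": 4, "on_topic": True, "links_on_page": 30, "nofollow": False}
    """
    mass = 0.0
    for link in links:
        if link["nofollow"]:
            continue                                         # assume nofollow links pass no value
        value = link["pagerank"] / link["links_on_page"]     # value diluted by other links on the page
        if link["on_topic"]:
            value *= 1.5                                     # hypothetical bonus for topical relevance
        mass += value
    return mass

competitor = [{"pagerank": 6, "on_topic": True, "links_on_page": 20, "nofollow": False}] * 50
my_site    = [{"pagerank": 3, "on_topic": True, "links_on_page": 40, "nofollow": False}] * 200

print(link_mass(competitor), link_mass(my_site))  # more links does not always mean more mass
```

With numbers like these, 50 strong links outweigh 200 weaker ones, which is exactly why raw backlink counts for the top ten can look out of order.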

PageRank: Caught in the paid-link crossfire

Last week the blogosphere was abuzz when Google decided to ‘update’ the PageRank numbers it displays on the toolbar. It seems Google has made good on its threat to demote sites engaged in buying and selling links for search rankings. The problem is that it caught some innocent ones in the crossfire. A couple of days later, Google corrected the mistake, and those sites are now back where they were supposed to be.

The incident reveals that there is a lot of misunderstanding about PageRank, both inside and outside the SEO community. For example, Forbes reporter Andy Greenberg writes:

On Thursday, Web site administrators for major sites including the Washingtonpost.com, Techcrunch, and Engadget (as well as Forbes.com) found that their “pagerank”–a number that typically reflects the ranking of a site in Google

He also quotes Barry Schwartz saying:

But Schwartz says he knows better. “Typically what Google shows in the toolbar is not what they use behind the scenes,” he says. “For about two and a half years now this number has had very little to do with search results.”

There are two mistakes in these assertions:

  • The toolbar PageRank does not reflect the ranking of a site in Google. It reflects Google’s perceived ‘importance’ of the site.

  • The toolbar PageRank is an approximation of the real PageRank Google uses behind the scenes. Google doesn’t update the toolbar PageRank as often as it updates the real thing, but saying that it has little to do with search results is far-fetched.

Several sites lost PageRank, but they did not experience a drop in search referrals. Link buyers and sellers use toolbar PageRank as a measure of the value of a site’s links. By reducing this perceived value, Google is clearly sending a message about paid links; the drop is intended to discourage such deals.

Some ask why Google doesn’t simply remove the toolbar PageRank altogether so that buyers and sellers won’t have a currency to trade with. At first glance it seems like a good idea, but here is the catch: the toolbar PageRank is just the bait that entices users to activate the component Google uses to study their online behavior. Google probably has several reasons for collecting that data, but at a minimum it helps the company measure the quality of its search results and improve its algorithms. If Google removed the toolbar PageRank, users would have no incentive to let Google ‘spy’ on their online activities. Read more

Popularity Contest: How to reveal your invisible PageRank

Let's face it. We all like to check the Google PageRank bar to see how important websites, especially ours, are for Google. This tells us how cool and popular our site is.

For those of us who are popularity-obsessed, the sad part is that the other search engines do not provide a similar feature, and Google's visible PageRank is updated only every three months (the real PageRank stays invisible). This blog is two months old and doesn't have a visible PageRank yet, but I get referrals from many long-tail searches, so it has to have some PageRank already.
How can you tell what your PageRank is without waiting for the public update? Keep reading to learn this useful tip. The technique is not bulletproof, but by studying how frequently your pages are re-indexed you can get a rough estimate of your invisible PageRank, and of how important your pages are to the other search engines as well. Read more
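One simple way to study crawl frequency is to count search engine bot visits per URL in your server logs. The sketch below assumes a standard Apache/Nginx combined log format and that Googlebot identifies itself in the user-agent string; adjust the parsing for your own server's layout.

```python
# Minimal sketch: count Googlebot visits per URL in an access log to gauge crawl frequency.
# Assumes the common Apache/Nginx combined log format; the file name is a placeholder.
import re
from collections import Counter

LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<url>\S+) HTTP/[^"]*"')

def googlebot_hits(log_path):
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "Googlebot" not in line:        # keep only requests made by Google's crawler
                continue
            match = LOG_LINE.search(line)
            if match:
                hits[match.group("url")] += 1
    return hits

if __name__ == "__main__":
    for url, count in googlebot_hits("access.log").most_common(10):
        print(f"{count:5d}  {url}")   # the most frequently crawled pages are likely the most 'important'
```

The same counting works for the other engines' bots (Slurp, msnbot), which is how you can compare how important your pages look to each of them.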

Google's Architectural Overview (Illustrated) — Google's inner workings part 2

For this installment of my Google's inner workings series, I decided to revisit my previous explanation. However, this time I am including some nice illustrations so that both technical and non-technical readers can benefit from the information. At the end of the post, I will provide some practical tips to help you improve your site rankings based on the information given here.

To start the high-level overview, let's see how the crawling (downloading of pages) was described originally.

Google uses a distributed crawling system; that is, several servers running clever software that download pages from across the entire web. Read more
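As a toy illustration of that idea, here is a sketch of many workers downloading pages in parallel using Python's standard library. The seed URLs are placeholders, and a real distributed crawler also spreads the work across many machines and handles robots.txt, politeness delays, deduplication, and URL scheduling.

```python
# Toy illustration of crawling: several workers downloading pages in parallel.
# Seed URLs are placeholders; this only hints at what a real distributed crawler does.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

SEED_URLS = [
    "https://www.example.com/",
    "https://www.example.org/",
]

def fetch(url):
    try:
        with urlopen(url, timeout=10) as response:
            return url, response.read()      # the downloaded page body
    except OSError as error:
        return url, error                    # network or HTTP error

with ThreadPoolExecutor(max_workers=8) as pool:
    for url, result in pool.map(fetch, SEED_URLS):
        status = "error" if isinstance(result, Exception) else f"{len(result)} bytes"
        print(url, status)
```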

Google's architectural overview — an introduction to Google's inner workings

Google keeps tweaking its search engine, and now it is more important than ever to better understand its inner workings.

Google lured Mr. Manber from Amazon last year. When he arrived and began to look inside the company’s black boxes, he says, he was surprised that Google’s methods were so far ahead of those of academic researchers and corporate rivals.

While Google closely guards its secret sauce, for obvious reasons, it is possible to build a pretty solid picture of Google's engine. To do this, we are going to start by carefully dissecting Google's original engine: how Google was conceived back in 1998. Although a newborn baby, it had all the basic elements it needed to survive in the web world.
Read more

LinkingHood v0.1 alpha — Improve your PageRank distribution

As I promised one of my readers, here is the first version of the code that mines log files for linking-relationship information.

I named it LinkingHood because the intention is to take link juice from the link-rich pages and give it to the poor ones.

I wrote it in Python for clarity (I love Python :-) ). I had been working on a more advanced approach involving matrices and linear algebra, but some of the feedback on that article gave birth to a new idea, and to make it easier to explain I decided to use a simpler approach here. This code is primarily an illustration: it does everything in memory and is extremely inefficient in its current form, so to scale to sites with 10,000 or more pages it would definitely need to be rewritten to use matrices and linear-algebra operations. (More about that in a later post.)

I simply used a dictionary of sets: the keys are the internal pages, and each set holds the pages that link to them. I tested it with my tripscan.com log file and included the results of a test run.
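For readers who want a feel for the data structure before downloading the script, here is a minimal sketch of the dictionary-of-sets idea, built from the Referer field of an access log. The regular expression assumes a combined log format, and the parsing details are my own illustration, not necessarily LinkingHood's.

```python
# Sketch of the dictionary-of-sets idea: keys are internal pages, values are the set of
# internal pages linking to them (inferred from the Referer field of the access log).
# Assumes a combined log format; LinkingHood's own parsing may differ.
import re
from collections import defaultdict
from urllib.parse import urlparse

SITE = "tripscan.com"
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<url>\S+) [^"]*" \d+ \S+ "(?P<referrer>[^"]*)"')

def build_link_graph(log_path):
    inbound = defaultdict(set)
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_LINE.search(line)
            if not match:
                continue
            referrer = urlparse(match.group("referrer"))
            if referrer.netloc.endswith(SITE):                 # keep internal linking relationships only
                inbound[match.group("url")].add(referrer.path or "/")
    return inbound

# Pages with few inbound links are the 'poor' ones that could use more internal link juice.
graph = build_link_graph("access.log")
for page, sources in sorted(graph.items(), key=lambda item: len(item[1]))[:10]:
    print(len(sources), page)
```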

Read more

The harder to get the link, the more valuable it is

Links that are easy to get do not help much, either in driving traffic or in building the authority that improves search engine rankings.

If your link is placed on a page where several hundred links compete for attention, potential visitors are less likely to click on it than if the page has only a few dozen links.

The value of your link source is in direct relation to how selective that source is when placing links on the page and how much traffic the source gets.  The value also declines with the number of links on the page.

Google is understood to use algorithms to measure the importance and quality of each page. PageRank, invented by Google's founders, measures the absolute importance of a page, while the TrustRank algorithm describes a technique for identifying trustworthy, high-quality pages. We cannot tell for sure to what extent Google uses these algorithms, if at all, or whether it uses the publicly known versions. What we can say, based on observation, is that Google definitely does not treat all links equally and does not pass authority to your page from every one of your link sources.
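The published PageRank formula makes the dilution point concrete: a page's score is split among its outgoing links, so each link from a crowded page passes less. Below is a minimal power iteration over a toy link graph. The damping factor of 0.85 comes from the original PageRank paper; the graph and page names are made up for illustration, and this ignores the many other link-value factors Google is believed to use.

```python
# Minimal PageRank iteration over a toy link graph, showing that a page's authority is
# divided among its outgoing links: the more links on a page, the less each one passes.
links = {                      # page -> pages it links to (made-up example site)
    "home": ["about", "products", "blog"],
    "about": ["home"],
    "products": ["home", "blog"],
    "blog": ["home"],
}

damping = 0.85
rank = {page: 1.0 / len(links) for page in links}

for _ in range(50):                                    # iterate until roughly stable
    new_rank = {page: (1 - damping) / len(links) for page in links}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)   # each outgoing link gets an equal share
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.3f}")
```

Even in this tiny example, a link from the page with one outgoing link passes three times as much as a link from the page with three, which is the intuition behind preferring selective link sources.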