I have never been a huge fan of correlation analysis. The reason is that how things behave in aggregate may not have anything to do with how they would behave in your market, for your keywords, on your website.
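To make the aggregate-versus-your-market point concrete, here is a minimal sketch using made-up numbers. The two markets, the link counts and the ranking scores are all hypothetical, chosen only to show that a signal can correlate positively across the whole data set while correlating negatively inside every individual market.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: (link_count, ranking_score) per site.
# A higher ranking_score means a better position. These are not real figures.
MARKETS = {
    "market_a": [(1, 5), (2, 4), (3, 3), (4, 2)],
    "market_b": [(11, 15), (12, 14), (13, 13), (14, 12)],
}


def correlation_by_market(markets):
    """Correlation between links and ranking score inside each market."""
    return {
        name: pearson([links for links, _ in rows], [score for _, score in rows])
        for name, rows in markets.items()
    }


def aggregate_correlation(markets):
    """Correlation between links and ranking score with all markets pooled."""
    rows = [row for market_rows in markets.values() for row in market_rows]
    return pearson([links for links, _ in rows], [score for _, score in rows])


if __name__ == "__main__":
    print("pooled:", round(aggregate_correlation(MARKETS), 3))
    for name, r in correlation_by_market(MARKETS).items():
        print(name, round(r, 3))
```

In this toy data the pooled correlation is strongly positive, yet within each market more links go with a worse score. Real ranking data is far noisier, but the same trap applies whenever markets differ in their baseline.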

Harmful High Quality Links?

A fairly new website ranked amazingly quickly on Google.com for a highly competitive keyword. It wasn’t on the first page, but it ranked about #20 for a keyword that is probably one of the 100 most profitable keywords online (presuming you could get to a #1 ranking above a billion-dollar corporation). The site did a promotion that was particularly well received by bloggers and a few bigger websites in the UK press, and at first rankings improved everywhere. Then one day, while looking at its rankings using rank checker, I saw the site had simply fallen off the map. It was nowhere. I then jumped into web analytics and saw search traffic was up. What happened was that Google took the site as being from the UK, so its rankings went to page 1 in the UK while the site disappeared from the global results.

In aggregate we know that more links are better & links from highly trusted domains are always worth getting. And yet in the above situation the site was set back by great links. Of course we can set the geographic market inside Google Webmaster Tools to the United States, but how long will it take Google to respond? How many other local signals will need to be fixed to pull the site out of the UK?

Over time those links will be a net positive for the site, but it still needs to develop more US signals. And beyond those sorts of weird things (like links actually hurting your site), the algorithms can look for other signals to feed into geotargeting. Things like Twitter mentions, where things are searched for, how language is used on your website, and perhaps even your site’s audience composition may influence localization. What is worse about some of these other signals is that they may mirror media coverage. If you get coverage in The Guardian, a lot of people from the UK will see it, and so you might get a lot of Tweets mentioning your website that are from the UK as well. In such a way, many of the signals can be self-reinforcing even when incorrect.

Measuring The Wrong Thing

Another area where correlation analysis falls short is when one page ranks based on the criteria earned by another. Such signal bleeding means that if you are looking at things in aggregate you are often analyzing data which is irrelevant.

Sampling Bias

Correlation analysis also has an issue of sampling bias. People tend to stick with defaults until they learn enough to change. Unfortunately most CMS tools are set up in sub-optimal ways. If you look at the top ranked results, some of the sub-optimal setups will be over-represented in the “what works” category simply because most websites are somewhat broken. The web is a fuzz test.

Of course the opposite of the above is also true: some of the best strategies remain hidden in plain sight simply due to the sheer number of people doing x poorly.
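The over-representation of default setups can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 90% share of sites on a default CMS setup, the site count and the random “quality” score are all invented, and the setup is deliberately given no effect on ranking at all.

```python
import random


def default_share_in_top(n_sites=1000, default_share=0.9, top_n=100, seed=0):
    """Return the fraction of top-ranked sites that use the default setup.

    Assumption: ranking depends only on a random quality score that is
    independent of whether the site uses the default CMS setup.
    """
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        uses_default = rng.random() < default_share
        quality = rng.random()  # unrelated to the setup by construction
        sites.append((quality, uses_default))

    # "Top results" are simply the sites with the highest quality score.
    top = sorted(sites, key=lambda site: site[0], reverse=True)[:top_n]
    return sum(1 for _, uses_default in top if uses_default) / top_n


if __name__ == "__main__":
    share = default_share_in_top()
    print(f"Share of top results using the default setup: {share:.0%}")
```

Even though the default setup contributes nothing here, it dominates the top results because it dominates the population. Counting how often a trait appears among winners says little unless you also know how common it is among everyone else.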

Analyzing Data Pairs Rather Than Individual Signals

Another way signals have blurred is in how Google uses page titles in the search results. The displayed title generally used to be just the page title. But more recently they started mixing in other sources:

  • using an on-page heading rather than the page title (when they feel the on-page heading is more relevant)
  • adding link anchor text into the title (in some cases)
  • adding the homepage’s title at the end of sub-pages (when sub-page page titles are short)

As Google adds more signals & changes how they account for signals, it makes analyzing what they are doing much harder. You not only need to understand how the signals are used, but also how they interact in pairs or groups. When Google uses the H1 heading on a page to display in the search results, are they still putting a lot of weight on the page title? Does the weighting on the H1 change depending on whether Google is displaying it or not?

Analysis is Still Valuable, but…

I am not saying that analysis is a waste of time, but rather that when you do it, lots of do’s and don’ts become far less concrete. The fact is that there are always edge cases that disprove any rule of thumb. Rather than looking for general rules, one needs to balance things like:

  • risk vs reward
  • yield vs effort
  • focus vs diversity
  • investment vs opportunity cost

First Mover Advantage

Along the same lines, any given snapshot of search is nowhere near as interesting as understanding historical trends and big shifts. If you are one of the first people to notice something there is far more profit potential than being late to the party. Every easily discernible signal Google creates eventually gets priced close to (or sometimes above) true market value. Whereas if you are one of the first people to highlight a change you will often be called ignorant for doing so. 😀

Consensus is the opposite of opportunity.

When you do correlation analysis you are finding out when the market has conformed to what Google trusts & desires. Exact match domains were not well ranked across a wide array of keywords until after Google started putting more weight on them & people realized it. But if there is significant weight on them today & their prices are sky high, then knowing that they carry some weight might not offer real profit potential in your market. It might even be a distraction or a dead end. Imagine being the person who bets (literally) a million dollars that Google will place weight on poker.org, only to find out that Google changes their algorithmic approach & weighting, or makes a special exception just for your site (as they can & have done). That day would require some tequila.

As a marketing approach becomes more mainstream, not only do the costs rise, but so does the risk of change. As people complain about domain names (or any other signal or technique), it makes Google more likely to act to curb the trend and/or lower its weighting & value. To see an extreme version of this, consider that the past year has seen lots of complaints about content farms. A beautiful quote:

Searching Google is now like asking a question in a crowded flea market of hungry, desperate, sleazy salesmen who all claim to have the answer to every question you ask.

And so Google promises action. Don’t make Google look stupid!

History Holds the Key for Success

The only way to profitably predict the future is to accurately understand history.

  • “Our ignorance of history makes us libel our own times. People have always been like this.” – Gustave Flaubert
  • “History repeats itself, first as tragedy, second as farce.” – Karl Marx
  • “We are the prisoners of history. Or are we?” – Robert Penn Warren, Segregation
