Right after I finished writing a post about how being likeable is a great business strategy, I went back to Sphinn and saw that it had erupted with controversy and negative feedback about SEOmoz’s Linkscape. Since then threads have been opened, closed, and reopened. People are worried about everything from the index size to how to remove your site to why you shouldn’t label your site with an obvious SEO footprint.
So my timing on that last post was a bit off, but I still think the general thesis is valid. But now that there has been so much negative feedback I figure it is my job to play devil’s advocate and highlight reasons why most SEOs do not need to be too worried about Linkscape.
Unique Linking Domains
One of the coolest features of this tool is knowing the number of unique linking domains pointing links at a specific site, but that feature is for paying members only.
A competing tool by the name of Majestic SEO allows you to see that data as part of their free overview. Click on the image below for an example.
If your competitor has high authority links then you need more than just quantity to compete, but if most of their backlinks are garbage then this is a good stat to have, along with many other stats you can get from tools like SEO for Firefox.
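If you already have a list of backlink URLs exported from one of these tools, the unique linking domain count is easy to approximate yourself. The sketch below is only an illustration: the URLs are made up, and it groups by hostname (minus a leading "www.") rather than by registered domain, which would need a public suffix list to do properly.

```python
from urllib.parse import urlsplit


def linking_domain(url):
    """Return the lowercased hostname of a backlink URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def unique_linking_domains(backlinks):
    """Collapse a list of backlink URLs into the set of distinct hostnames."""
    return {host for host in map(linking_domain, backlinks) if host}


# Hypothetical backlink export, for illustration only.
backlinks = [
    "http://www.example.com/links.html",
    "http://example.com/blog/post-1",
    "http://blog.example.org/2008/10/seo-tools",
    "http://example.net/resources",
]

domains = unique_linking_domains(backlinks)
print(len(backlinks), "links from", len(domains), "hosts")
```

Here four links collapse to three hosts, since the two example.com URLs count once.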
Not that I advocate spam reporting (the official guidelines have departed so far from reality that almost everyone who ranks is spamming and/or spammed in the past to get to their current market position). But for professional SEOs who own dozens of sites and like doing spam reports to Google, this might be a good tool for outing competitors, since it makes it easy to find some noscript links, links from off the page, and inbound 301 redirects. The average webmaster probably does not need to worry about that.
A Bit Top Heavy
One of the biggest limitations in Linkscape is that you can only go 500 results deep unless you want to buy a custom report. They allow you to see various lenses of 500 at a time through search features and filters, but a big recommendation I can make on this front is for them to allow you to see all that data, even if it requires exporting data to CSV…they already spent the money to collect the data, so if you’re a customer they may as well give it to you…it helps nobody if nobody sees it.
Majestic SEO appears to have a database similar in size to Linkscape’s, and they allow you to do a full data export for your own domain free of charge. For other domains they charge a price that scales with the number of links to the domain.
More Cool Features?
Nick Gerner promised more features in the next version of Linkscape, but unless they start buying usage data and become more like Compete.com I am not sure if it will be a game changer. On to explaining why…
1. Editorial Rules
When Linkscape was announced Danny Sullivan said:
Personally, I’m not too worried. You want to compete with me and get links in places where I’m listed? We get listed in places where editorial rules. So just knowing where we’re at doesn’t get you in the door — you have to be good enough to walk in. And if you are good enough, well, good I guess.
The highest quality links typically tend to be editorial in nature, with many of those being driven by social relationships. No matter how much one decides to analyze link patterns, they can’t re-create most of the link relationships if they don’t already have the content quality, market exposure, and awareness. And if you copy someone’s idea after they already did it you need to greatly improve upon it to get credit for it.
2. Tons of Alternative Data Sources
Common link analysis questions…
How do I Get a Basic Competitive Overview of the Search Results?
Search Google with SEO for Firefox turned on. Make sure you are pulling data in the automatic mode while searching.
I Want to do Anchor Text Analysis. How do I Analyze Links?
Some options include…
- SEO Link Analysis – a free Firefox extension that adds anchor text to Google Webmaster Central and Yahoo! Site Explorer.
- Link Diagnosis – another useful Firefox extension.
- Link Analysis Tool – shows the PageRank and number of inlinks to each page on a site, though it requires you to set up a MySQL database.
- Both Google Webmaster Central and Majestic SEO allow you to download backlink profiles for your own sites after you authenticate your sites.
- Backlink Analyzer – a free desktop based tool I had created a few years ago that pulls data from the Yahoo! API. Make sure to watch the video on the download page before using it.
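Whichever tool you pull the data from, the analysis itself mostly comes down to counting. Here is a minimal sketch, assuming you have exported (source URL, anchor text) pairs; the sample rows and the function name are hypothetical and not taken from any of the tools above.

```python
from collections import Counter


def anchor_text_distribution(links):
    """Count anchor texts case-insensitively, collapsing repeated whitespace.

    links: iterable of (source_url, anchor_text) pairs.
    Returns a list of (anchor, count, share_of_all_links), most common first.
    """
    counts = Counter(" ".join(anchor.lower().split()) for _, anchor in links)
    total = sum(counts.values())
    return [(anchor, n, n / total) for anchor, n in counts.most_common()]


# Hypothetical export, for illustration only.
links = [
    ("http://example.com/a", "SEO Tools"),
    ("http://example.org/b", "seo  tools"),
    ("http://example.net/c", "click here"),
    ("http://example.edu/d", "Example Site"),
]

for anchor, count, share in anchor_text_distribution(links):
    print(f"{anchor:15} {count:3} {share:.0%}")
```

A profile where one commercial phrase accounts for most of the links tends to look a lot less natural than one with plenty of brand name and "click here" style anchors.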
I Want to Find New Links to Competing Sites
If you want to find what someone’s best ideas are all you have to do is subscribe to the Google Blogsearch feed for links to their site, like so. That should list many of the people who are talking about this site.
A paid option on this front is Advanced Link Manager. It costs $199 (or $299 if you package it with Advanced Web Ranking) and scrapes data from Yahoo!, keeping track of the date when the link was found.
I Want to Find New Links to My Site
This is the same as competing sites, but you can also use your web analytics and server logs to dig up additional information. You can also look inside Google Webmaster Central to download backlink reports.
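For the server log route, here is a rough sketch of pulling external referrers out of an access log. It assumes the common Apache/nginx Combined Log Format; the log lines and the `mysite.example` hostname are invented for the example, and search engine referrers will show up alongside real links unless you filter them out.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined Log Format: ... "REQUEST" STATUS BYTES "REFERER" "USER-AGENT"
LOG_LINE = re.compile(r'"[A-Z]+ \S+ [^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "')


def external_referrers(log_lines, own_host):
    """Count referrer URLs that come from hosts other than own_host and its subdomains."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        referer = match.group("referer")
        host = (urlsplit(referer).hostname or "").lower()
        if host and host != own_host and not host.endswith("." + own_host):
            counts[referer] += 1
    return counts


# Hypothetical log lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Oct/2008:13:55:36 -0700] "GET /blog/ HTTP/1.1" 200 2326 "http://example.org/seo-roundup" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2008:13:56:01 -0700] "GET /about HTTP/1.1" 200 512 "http://www.mysite.example/blog/" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2008:13:57:12 -0700] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

for referer, hits in external_referrers(sample_log, "mysite.example").most_common():
    print(hits, referer)
```

Only links that actually send visitors appear in a log, so this complements a backlink report rather than replacing one.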
I Want to Find The Most Authoritative Links Pointing at a Site
Yahoo! Site Explorer generally orders backlinks roughly in terms of authority, with some of the most authoritative backlinks showing up at the top of their results.
I Want to Find .edu Links
Yahoo! Search offers a wide array of advanced link operators. Here are .edu & .gov links pointing at searchengineland.com.
I Want to Get an Estimate of Unique Linking Domains
Majestic SEO offers a free estimate…though, like LinkScape, their crawl is not as comprehensive as Yahoo!’s.
I Want to Find Hub Links
- Hub Finder is a great tool for finding topical hubs.
- Google TouchGraph is another great option with a cool graphical interface.
What Sites Drive the Most Traffic to My Competitors?
The best way I have found to get this data is from Compete.com Referral Analytics, though it requires a $500 a month subscription…which is a nice chunk of change, unless you are already doing quite well!
Do I Have Any Broken Links?
- Xenu Link Sleuth will crawl your site and help you find broken links, showing you which pages the broken links were on.
- Google Webmaster Central offers broken link reports.
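If you would rather script a quick check for a single page, the idea behind these tools can be sketched in a few lines. This is not how Xenu or Google does it, just a minimal illustration: it pulls the links out of one page's HTML and treats any status of 400 or above, or no response at all, as broken. The status lookup is passed in as a function so the example below can run offline against a made-up status table; note that some servers answer HEAD requests differently from GET requests.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute http(s) URLs from the <a href> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            url = urljoin(self.base_url, href)
            if url.startswith(("http://", "https://")):
                self.links.append(url)


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError):
        return None


def broken_links(page_url, html, fetch_status=http_status):
    """Return (page_url, link, status) for every link that looks broken."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    report = []
    for link in parser.links:
        status = fetch_status(link)
        if status is None or status >= 400:
            report.append((page_url, link, status))
    return report


# Offline demonstration with a hypothetical page and a stubbed status table.
html = '<a href="/ok">ok</a> <a href="http://example.com/gone">gone</a>'
fake_statuses = {"http://mysite.example/ok": 200, "http://example.com/gone": 404}
print(broken_links("http://mysite.example/page", html, fake_statuses.get))
```

Leave out the third argument to check live URLs with `http_status` instead of the stub.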
3. All Link Graphs Are Unique
Each search engine has its own crawling priorities and own web graph. Google has probably spent hundreds of millions of dollars building and refining their crawling sequence. No two crawls are the same. Further, the Linkscape system deviates from how most major search engines treat the robots noindex meta tag. Danny Sullivan stated:
SEOmoz will treat noindex also as nofollow — it won’t follow links on pages that you’ve blocked from being indexed.
When Matt Cutts was interviewed by Eric Enge last year he stated:
The NoIndex and NoFollow metatags are independent. The NoIndex metatag, for Google at least, means don’t show this page in Google’s index. The NoFollow metatag means don’t follow the outgoing links on this entire page.
Image from Google Touchgraph.
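To make the difference between those two quotes concrete, here is a small sketch that reads a page's robots meta tag and applies both interpretations. The two policy functions are my own paraphrase of the quoted statements, not code from either crawler, and the parser ignores less common directives such as "none".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def independent_policy(directives):
    """noindex and nofollow are separate switches, as in the Matt Cutts quote."""
    return {
        "index": "noindex" not in directives,
        "follow": "nofollow" not in directives,
    }


def noindex_implies_nofollow_policy(directives):
    """noindex also stops link following, as Danny Sullivan describes for Linkscape."""
    index = "noindex" not in directives
    return {"index": index, "follow": index and "nofollow" not in directives}


page = '<html><head><meta name="robots" content="noindex"></head></html>'
directives = robots_directives(page)
print(independent_policy(directives))
print(noindex_implies_nofollow_policy(directives))
```

For a noindex-only page the first policy still follows the outbound links while the second does not, so the two crawlers end up with different link graphs from the same set of pages.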
4. Yahoo! Search Counts Link Weight Differently Based on Page Segmentation
Google’s PageRank was designed based on a random walk theory, where browsers click a random link on the page. But search engines are looking to move beyond the random walk model.
Yahoo! Search’s Priyank Garg stated:
The irrelevant links at the bottom of a page, which will not be as valuable for a user, don’t add to the quality of the user experience, so we don’t account for those in our ranking. All of those links might still be useful for crawl discovery, but they won’t support the ranking.
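One way to picture the shift is a random walk where links are not all equally likely to be clicked. The sketch below is a plain power-iteration PageRank with an optional weight on each link. With every weight at 1.0 it is the classic random walk; setting footer links to 0.0 mimics the kind of discounting Priyank Garg describes. The weights, the three-page graph, and the damping value are assumptions for illustration, not anything Yahoo! or Google has published about their production systems.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    links: {source_page: [(target_page, weight), ...]}
    Returns {page: score}; the scores sum to 1.
    """
    pages = set(links)
    for outlinks in links.values():
        pages.update(target for target, _ in outlinks)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            outlinks = [(t, w) for t, w in links.get(page, []) if w > 0]
            total = sum(w for _, w in outlinks)
            if total == 0:
                # No counted outlinks: spread this page's rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            for target, weight in outlinks:
                new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Hypothetical three-page site: "home" links to an article in its content
# and to a sponsor in its footer.
uniform = {
    "home": [("article", 1.0), ("sponsor", 1.0)],
    "article": [("home", 1.0)],
    "sponsor": [("home", 1.0)],
}
segmented = dict(uniform, home=[("article", 1.0), ("sponsor", 0.0)])

for name, graph in (("uniform", uniform), ("segmented", segmented)):
    scores = pagerank(graph)
    print(name, {page: round(score, 3) for page, score in sorted(scores.items())})
```

In the uniform run the article and the sponsor score the same; in the segmented run the sponsor keeps only the small random-jump share, even though the link to it is still on the page and still crawlable.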
5. Microsoft May be Looking to Heavily Incorporate Usage Data
Microsoft did research on BrowseRank, which aims to use actual usage data to augment (or perhaps replace) their link graph. By default, Internet Explorer 8 sends usage data to Microsoft…when you know what 80% of web users are doing you do not need to rely on a random walk.
Think of having access to the majority of the web’s usage data like this:
- If Google’s algorithms are more relevant than Microsoft’s, then putting weight on usage data allows Microsoft to quickly catch up by weighting whatever Google is weighting
- Microsoft could theoretically be better than Google at filtering out paid links, as most paid links in a sidebar or footer do not send much traffic…and thus could easily be weighted less than links in content – though with Google owning so many products they could improve significantly on this front as well, if they decided to use their AdSense data, analytics data, Chrome browser data, Feedburner data, and toolbar data.
6. Google Does a Lot of Hand Editing
Beyond those editors there are many search engineers inside the webspam team offering a variety of techniques to throw off SEOs, including
- stripping all PageRank from a site and killing all its rankings
- stripping some portion of a site’s PageRank and ranking abilities
- stripping PageRank from the toolbar but still allowing sites to rank
- showing full PageRank in the toolbar, but killing the ability of a link to pass PageRank
Without working inside of Google and/or buying and testing lots of links across a wide array of sites and verticals it would be hard to know if any particular site passes PageRank, and how much it might pass. For instance, a link from Text-Link-Ads.com’s website is one of my highest MozRank links, but I doubt Google places much weight on that link since Google does not let Text Link Ads rank for their own brand.
Read Eric Schmidt’s perspective on brands to consider how Google holds different sites to different standards.
7. Search Engine Editorial Policies are Selective, & Constantly Changing
According to Udi Manber, Google did 450 search algorithm updates last year. Even if you could somehow catch up with all the editorial stuff search engines were doing to manipulate their version of the link based web graph, you would have a hard time keeping up with it – let alone accounting for the hoards of usage data the search engines have.
The status of a link (and its ability to pass PageRank) may arbitrarily change based on media exposure. In the past many websites were hijacked by 302 affiliate links (this even happened to Google’s site, and this is still happening today to corporate sites as big as Snapnames).
At an SEO conference about 3 or 4 months back someone highlighted that some large sites use 301 redirects on affiliate links. This topic came up once again at SMX East, where it was deemed an acceptable marketing practice:
Shockingly, when asked point blank if affiliate programs that employed juice-passing links (those not using nofollow) were against guidelines or if they would be discounted, the engineers all agreed with the position taken by Sean Suchter of Yahoo!. He said, in no uncertain terms, that if affiliate links came from valuable, relevant, trust-worthy sources – bloggers endorsing a product, affiliates of high quality, etc. – they would be counted in link algorithms. Aaron from Google and Nathan from Microsoft both agreed that good affiliate links would be counted by their engines and that it was not necessary to mark these with a nofollow or other method of blocking link value.
A few years ago I set up my affiliate program to use 301 redirects to prevent hijacking, and get any link benefits I could. But right after I changed my business model to a membership site my affiliate program was featured/outed in this interview, and it no longer passes PageRank.
Watch the above video and see how at 2 minutes and 15 seconds in my site was put up for review to any Google engineer that happened to watch it.
The same set of links, to the same site, using the same format, under similar circumstances…
- counts for most major corporations (and is allegedly an approved and legitimate strategy)
- counted for this site for years
- stopped counting around the time they were outed by a popular SEO blogger
8. Temporal Algorithms + Domains Expire, & May Lose PageRank
Search engines may place weight not only on the number of links pointing at a page, but also on the rate at which links are accumulated. Even if you know the raw number of links and the site age it still does not tell you how many links were built last month or in the last year.
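If you do keep a record of when each link was first seen (which a one-time snapshot of the web cannot give you), link velocity is just a count per month. A minimal sketch, assuming a hypothetical list of first-seen dates:

```python
from collections import Counter
from datetime import date


def links_per_month(first_seen_dates):
    """Return [((year, month), new_link_count), ...] in chronological order."""
    counts = Counter((d.year, d.month) for d in first_seen_dates)
    return sorted(counts.items())


# Hypothetical first-seen dates for links pointing at one site.
first_seen = [
    date(2008, 8, 3),
    date(2008, 9, 17),
    date(2008, 10, 2),
    date(2008, 10, 9),
    date(2008, 10, 21),
]

for (year, month), count in links_per_month(first_seen):
    print(f"{year}-{month:02d}: {count} new links")
```

Two sites with the same raw total of five links could look very different here: one with a steady trickle, the other with a sudden burst in a single month.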
Not only are links born, but some of them rot. The web graph as a whole is over a decade old. Linkrot was a big issue in 1998, and it is still a big issue today. In 1998 6% of links were broken, and the DotBot crawl shows 7% of links being broken.
To appreciate how bad linkrot is…
- if you publish a large site with many outbound links, run Xenu Link Sleuth through your site and see how many broken links you find
- if you run a large complex portal, check your error logs or sign up for Google Webmaster Tools and discover how many broken links you have in your site
Some domains that expire may keep their PageRank, but many expiring domains lose their PageRank. With how hard it is to build links today and roughly 7% of links broken, there are SEO tools designed around trying to capture this link equity.
The domains that die off may later be re-registered and re-purposed. And keep in mind that the true share of dead links is actually much higher than that 7% number when you consider how many people buy expired domain names and build them out.
By creating an index of the web in 2008 a person would have no idea if…
- the links occurred recently
- the links are old
- the site expired and potentially lost much of its link weight
And Matt Cutts generally hates re-purposing expired domain names. Why? The very first spam site he found was a high PageRank expired domain linked from the W3C. That site was converted to a porn site, and ever since then (before Matt was the head of the webspam group – before Google even had a webspam group) Matt has not liked expired domains.
Matt offers background on that story 30 seconds into this video:
9. Advancing Algorithms That Move Away From PageRank & Anchor Text
Paid links have been an obvious weak spot in the relevancy algorithms for years. PageRank and anchor text are still both important, but Google also considers other factors like…
- domain age / link age
- domain name (and extension)
- domain history (ie: spam infractions/penalties, etc.)
- site authority
- signals of locality (hosting location, TLD, link sources, etc.)
- searcher intent (Google’s Amit Singhal stated “the same query can mean entirely different things in different countries. For example, [Côte d’Or] is a geographic region in France – but it is a large chocolate manufacturer in neighboring French-speaking Belgium”)
- other forms of search personalization (past searches, user subscriptions, frequently visited sites, etc.)
- editorial partnerships with news companies & other universal search categories (like Google Shopping Search and the maps local onebox)
- usage data (especially with sites they host, like YouTube)
- content age (read up on the Query Deserves Freshness algorithm)
Look at some of the search results from Google’s 2001 index and compare them to current search results to see how much Google has moved away from a raw PageRank model. Yahoo! Search’s Priyank Garg also stated that they have moved away from placing so much weight on links:
All of those links might still be useful for crawl discovery, but they won’t support the ranking. That’s what we are constantly looking at in algorithms. I can tell you one thing, that over the last few years as we have been building out our search engine and incorporating lots of data, the absolute percentage contribution of links and anchor text to the natural ranking of algorithms or to the importance in our ranking algorithms has gone down somewhat.
It is not that Linkscape is a bad tool; it is just aiming to do something incredibly complex. As long as Yahoo! Site Explorer gives us a decent free sample (and other tools let us layer data on top of Yahoo!) we can get a good idea of the approximate level of competition for free. But with Yahoo! at $12 a share, if Yahoo! gets bought out and Site Explorer goes away then Linkscape (or Majestic SEO, depending on who does a better job of innovation) might be one of the best SEO investments one can make.