From: Bob M.
Dear Jill,
Thanks for your great newsletter. It's one of the few,
perhaps the only one, of the many newsletters that I subscribe
to that I find regularly informative and useful. I especially
enjoy your informative rants, so perhaps I'll supply justification
for you to climb up on your soapbox and wax eloquent.
I'm relatively new to search engine optimization. Earlier
this year, when I knew less than I do now, and at the request
of upper management, I submitted our site (already listed)
to Google under a different URL. I was wary of this strategy
at the time, and have become more so the more I have learned.
I'm not surprised that a search of Google for the alias
URL produces no results. But I'm also concerned whether
Google may have taken punitive action regarding our existing,
established site/URL. One pretty good hint is that one keyword
that had produced a #4 Google rank last November now shows
nothing for us in the first 20 pages.
Two other bits of information complicate drawing any causal
inference: first, we completely redesigned the site in February
this year. (I've been careful to make the content keyword
rich and appropriate.) Second, I know Google gets search
results from Open Directory in addition to the information
gained from their own crawlers. Open's volunteer editor
policy means the updated information we sent them in February
has yet to show up in our listing with them (or, for that
matter, in Google's). Nonetheless, our keyword ranks in
Open are very good, in stark contrast to Google's.
My primary questions are whether our Google rankings are
being suppressed in response to our misguided attempt at
a duplicate listing, and, if so, how we can atone for our
sins and restore our good standing with them.
Thanks for your weekly information in general, and any
help in particular.
Bob
++Jill's Response++
(Note: Bob didn't want his site to be mentioned in the
newsletter for obvious reasons, but it was supplied in his
original email.)
When I get these kinds of questions, the first thing I
do is check the site with my Google Toolbar turned on, so
I can see if the PageRank graph is grayed out or at zero.
If Google has imposed a penalty on a site, it's usually
evident by looking at the PageRank. (For more info on PageRank
and the Google Toolbar, please read my PageRank Summary
here: <http://www.rankwrite.com/archives/issue070.htm#seo>.
Gee...lots of Rank Write references today!)
So I plugged Bob's site into IE and saw that it had a respectable
PageRank of 5, which indicates that there's no penalty involved.
Next, I checked Google's cache of the page to see if they
were showing a blank page, or something other than the current
site. Strangely enough, Google had no record of the page
in its cache. So I checked the backward links, because usually
if it's not in the cache, there will also be no backward
links. However, there were *a lot* of backward links. So
things seemed stranger by the minute. The site was indeed
listed in DMOZ as Bob had stated, and also in Yahoo!. It's
got backward links and a good PageRank, so what could be
the problem?
It seemed to me that for some reason Google must not have
been able to spider the site. My first thought was that
the server may have been down when the Googlebot came a-crawlin'.
But then something else hit me. Perhaps Googlebot *couldn't*
spider the site. Perhaps it was excluded from crawling the
site through the robots.txt file.
For those who don't know what this is, it's a simple text
file that you can put on your server to exclude search engine
crawlers from accessing certain pages or directories of
your site. For instance, if you have password-protected
directories on your site with info that you don't want the
general public to get their hands on, you might exclude
crawlers using this file. (For more information on this,
please see: <http://www.robotstxt.org/>.)
So the next thing I needed to do was check out Bob's robots.txt
file. (To do that, you simply type the domain name followed
by "/robots.txt" into your browser, e.g., bobsdomain.com/robots.txt.)
Here's what I found there:
User-agent: *
Disallow: /
Disallow: /Admin
Disallow: /Appraisal
Disallow: /Content
Disallow: /Custom
Disallow: /Images
Disallow: /Logon
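Aha! There was the answer I suspected! Someone in Bob's
organization had put up a robots.txt file that excluded
ALL search engines from crawling ALL parts of his site!
The culprit is the "Disallow: /" line. Under "User-agent: *",
it tells every crawler to stay away from everything on the
site, which makes the directory-specific lines below it
redundant. (To be sure I was reading the file correctly, I
checked with a techie friend, who confirmed my suspicions.)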
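
If you'd rather not rely on reading the file by eye, here's a
minimal sketch of the same check using Python's standard
urllib.robotparser module. It isn't how I checked Bob's site;
it's just one way to confirm what a robots.txt file allows.
The helper name can_crawl, the "Googlebot" user-agent string,
and the "fixed" version of the file (the original with the
blanket "Disallow: /" line removed) are my own assumptions
for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt contents found on Bob's site, as quoted above.
ORIGINAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
Disallow: /Admin
Disallow: /Appraisal
Disallow: /Content
Disallow: /Custom
Disallow: /Images
Disallow: /Logon
"""


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical fix (an assumption, not what Bob's team actually did):
# drop only the blanket "Disallow: /" rule and keep the directory rules.
FIXED_ROBOTS_TXT = "\n".join(
    line
    for line in ORIGINAL_ROBOTS_TXT.splitlines()
    if line.strip() != "Disallow: /"
)

HOME_PAGE = "http://bobsdomain.com/"
ADMIN_PAGE = "http://bobsdomain.com/Admin/"

if __name__ == "__main__":
    print("Original file, home page crawlable:",
          can_crawl(ORIGINAL_ROBOTS_TXT, HOME_PAGE))
    print("Fixed file, home page crawlable:",
          can_crawl(FIXED_ROBOTS_TXT, HOME_PAGE))
    print("Fixed file, /Admin/ crawlable:",
          can_crawl(FIXED_ROBOTS_TXT, ADMIN_PAGE))
```

With the original file, the home page is off-limits to every
crawler. With the blanket rule removed, the home page opens up
again while the listed directories stay excluded.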
The moral of this story is that if your site is not showing
up in any given engine, the chances are that you are *not*
banned. It's actually very, very rare for engines to ban
or penalize sites. You have to be doing some pretty nasty
things for that to happen. It's extremely rare to be banned
by mistake or simply because you did something that you
didn't know would be considered spam. Those that get banned
for real almost always know *exactly* what they did wrong.
So don't just assume that you're banned if your site is
missing. Do some detective work and find out the real reason,
then fix it! I've seen other instances where the problem
had to do with misconfigured servers and IPs and other things
like that. Sometimes it's as simple as your site being down
when the bot tried to visit it. Just remember that you're
probably *not* banned. If you think you did something that
the engine might consider spam, then fix it and wait for
the next crawl.
If you never do anything even remotely shady, you won't
have to worry, now will you?