January 10, 2007

Copyright and Google Book Search

Question: does reprinting and republication in a new format give the publisher rights to control the use of a public domain work?

Answer: not under current copyright law!

But Google would like it to. In Google Book Search, they’ve included text placing restrictions on the use of the books they’re making available. Those restrictions aren’t necessarily enforceable, but users who don’t know the laws surrounding copyright will frequently accept whatever is written as if it were law.

Philipp Lenssen is challenging this position. Good luck, Philipp!

December 19, 2006

Leverage Google’s Sitemaps for Your Business

I read questions about Google’s Sitemaps pretty regularly. They generally follow this shape: “Should I use Google’s Sitemaps to get my site indexed/improve my rankings/escape the sandbox?” My answer is pretty much always the same: No. To put it simply, most people are asking the wrong question. The value of Google’s Webmaster Central resources, and particularly of the Sitemap protocol (which is now an accepted shared format for Google, Yahoo! and Live), is information.

There are no automatic benefits to Sitemaps - this isn’t the infamous mass search engine submission of the late ’90s. If you create a Sitemap and tell a search engine about it, they’ll happily crawl it. They’ll eagerly learn what you want to tell them about your site: which pages you consider most important, how frequently each page might be updated and when it was most recently updated. Having learned this, they might weigh those factors when considering which pages to crawl and when. And, although I can’t say this with any authority, they won’t add your site to the Google Index just because you’ve submitted a Sitemap.
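To make those three hints concrete, here is a minimal sketch in Python of what a Sitemap file carries. The element names (loc, lastmod, changefreq, priority) and the namespace come from the shared Sitemap protocol; the build_sitemap helper, the example.com URLs and the values are made up for illustration, and nothing here is an official Google tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the shared Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return Sitemap XML for a list of page dicts (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = page["loc"]
        # The rest are optional hints to the crawler, not commands.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")

# Made-up pages on an example domain.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2006-12-19",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "http://www.example.com/about", "changefreq": "yearly",
     "priority": 0.3},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Save the output as a file on your site and tell the search engine where it is; what the crawler does with the hints is up to the crawler.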

The Sitemap is more for you and your business than it is for Google. If you take a few minutes to look at what Google will tell you about your site, you can learn incredibly valuable information for your business.

What information can you learn from Google Sitemaps?

  1. Crawl Data.

    When was your site last visited? Does your site have pages included in Google’s Index? What kind of errors has Googlebot found? Talk about troubleshooting - has your traffic dropped abruptly? Well, maybe you should log in to Sitemaps and see what’s up. They might even tell you whether you’ve violated their guidelines - in which case you can immediately correct the problem and beg forgiveness, er, request reinclusion.

    Maybe you’ve caused the problem yourself: have you blocked Google from your site using your robots.txt file? Whoops! Google will tell you, and even allow you to actively test different versions of robots.txt so you can determine what you need to change.

  2. Query Statistics.

    A search marketing campaign lives and fights on a diet of statistics. Every one of those statistics tells you something, and Google is willing to provide you with a handful of very valuable ones through the Webmaster Central console. If you’re interested in PageRank, for example, you can see how PageRank is distributed amongst your pages. And if you think that the trifold distribution graphs are too simplistic: well, PageRank is already a simplified system. Don’t worry about it.

    More importantly, you’ve got access to Query stats. What could possibly be more valuable than information from Google explicitly telling you a) what search terms have brought your site into the search results, b) what rank in the search results pages you had for each of those terms, c) what search terms actually brought traffic to your site and d) what position in the search results pages you had for those terms?

    Maybe you don’t know how these could be useful: but this is some of the most valuable information you’ll find anywhere. You’ll learn what terms are going to show your site; and you’ll learn what terms will actually bring traffic. You can download this information in a variety of formats and track it to keep a very close eye on your site’s behavior. This can easily be the first indicator you’ll have that something is changing in your site’s indexing and ranking.

  3. Page Analysis.

    You can easily find out what the most common words are in your site. Google tells you this; but there are many other ways of getting that information. It is, however, much more difficult to learn which words appear most commonly in the text of links pointing to your site. Google will supply that information. At the moment, at least, they aren’t giving rich statistics: just a list, in no obvious order, of the most common terms found. Nonetheless, if you’re not finding a good match between common link texts and your site’s content, you might need to think about how to build more valuable links. It’s information, and you can use it.
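On the robots.txt point from the Crawl Data item: you can do a rough version of that check yourself before you ever log in. The sketch below uses the robots.txt parser from Python’s standard library to compare a current file with a proposed fix. The two files, the example.com URLs and the allowed helper are all made up for illustration, and Python’s parser is not Google’s, so it may disagree with Googlebot on less common rules. Treat the test in Webmaster Central as the final word.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Check one URL against robots.txt text (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A made-up "whoops" file that blocks every crawler from everything.
current = """\
User-agent: *
Disallow: /
"""

# A made-up fix that only keeps crawlers out of one directory.
proposed = """\
User-agent: *
Disallow: /private/
"""

url = "http://www.example.com/products.html"
blocked_now = not allowed(current, "Googlebot", url)
fixed = allowed(proposed, "Googlebot", url)

print("Blocked by current file:", blocked_now)
print("Allowed by proposed file:", fixed)
```

Run it against the handful of URLs you care most about; if a page you want indexed comes back blocked, you’ve found your problem.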

But this is just a sample of what’s available from Google Webmaster Central. Setting the speed at which Googlebot crawls your site, choosing your preferred domain (www or non-www) and joining the enhanced image search program are all possible from Webmaster Central. The service is changing rapidly - to follow Sitemaps updates you’ll want to stay tuned to the Google Webmaster Central Blog, where Vanessa Fox and her team provide news covering uses of Webmaster Central as well as new features and processes they’re offering.

December 18, 2006

Adam Lasnik on Duplicate Content

One phrase which was repeatedly emphasized during the first couple of days at SES Chicago was “If you have questions about duplicate content, Adam Lasnik is speaking at a session dedicated entirely to that discussion.” This frequently came at the beginning of Q & A sessions following a series of speakers who talked about any aspect of search indexing. Just a hunch, but I suspect the intent was to reduce the number of people asking questions about duplicate content during sessions on other subjects.

Personally, I didn’t go to the duplicate content session. Hopefully this repeated statement did actually cause numerous question-asking individuals to attend that session. It certainly didn’t prevent people from asking about duplicate content in other sessions, however.

Regardless, if you feel like you missed out on learning about duplicate content, have no fear. Adam has just published his thoughts about duplicate content at the Google Webmaster Central blog. I’m going to guess that people who attended SES will have learned more than he’s gone through in this post. Still, the post is definitely a good summary of duplicate content issues and what to do about them. (At least as far as Google’s concerned.)

Filed under: Google, Search (General)
