Thursday, December 24, 2009

Helping webmasters from user to user

You have to have some kind of super-powers to keep up with all of the issues posted in our Webmaster Help Forum—that's why we call our Top Contributors the "Bionic Posters." They're able to leap through tall questions in a single bound, providing helpful and solid information all around. We're thankful to the Bionics for tackling problems both hard and easy (well, easy if you know how). Our current Bionic Posters are: Webado (Christina), Phil Payne, Red Cardinal (Richard), Shades1 (Louis), Autocrat, Tim Abracadabra, Aaron, Cristina, Robbo, John, Becky Sharpe, Sasch, BbDeath, Beussery (Brian), Chibcha (Terry), Luzie (Herbert), 奥宁 (Andy), Ashley, Kaleh and Redleg!

With thousands of webmasters visiting the English Help Forum every day, some questions naturally pop up more often than others. To help catch these common issues, the Bionic Posters have also helped to create and maintain a comprehensive list of frequently asked questions and their answers. These FAQs cover everything from "Why isn't my site indexed?" to diagnosing difficult issues with the help of Google Webmaster Tools, often referring to our Webmaster Help Center for specific topics. Before you post in the forum, make sure you've read through these resources and do a quick search in the forum; chances are high that your question has been answered there already.

Besides the Bionic Posters, we're lucky to have a number of very active and helpful users in the forum, such as: squibble, Lysis, yasir, Steven Lockey, seo101, RickyD, MartinJ and many more. Thank you all for making this community so captivating and—most of the time—friendly.

Here are just a few (well, a little more than a few) of the many comments that we've seen posted in the forum:

  • "Thank you for this forum... Thank you to those that take the time to answer and care!"
  • "I've only posted one question here, but have received a wealth of knowledge by reading tons of posts and answers. The time you experts put into helping people with their problems is very inspiring and my hat's off to each of you. Anyway, I just wanted to let you know that your services aren't going unnoticed and I truly appreciate the lessons."
  • "Thank you very much cristina, what you told me has done the trick. I really appriciate the help as this has been bugging me for a while now and I didn't know what was wrong."
  • "thank you ssssssssssoooo much kaleh. "
  • "OK, Phil Payne big thanks to You! I have made changes and maybe people are starting to find me in G! Thanks to Ashley, I've started to make exclusive and relevant content for people."
  • "If anything, it has helped me reflect on the sites and projects of days gone by so as to see what I could have done better - so that I can deliver that much more and better results going forward. I've learned that some things I had done right, were spot on, and other issues could have been handled differently, as well as a host of technical information that I've stored away for future use. Bottom Line: this forum rocks and is incredibly helpful."
  • "I asked a handful of questions, got GREAT help while doing a whole lot of lurking, and now I've got a site that rocks!! (...) Huge thanks to all the Top Contributors, and a very special mention to WEBADO, who helped me a TON with my .htaccess file."
  • "Over the years of reading (and sometimes contributing) to this forum I think it has helped to remove many false assumptions and doubts over Google's ranking systems. Contrary to what many have said I verily believe Google can benefit small businesses. Keep up the good work. "
  • "The forum members are awesome and are a most impressive bunch. Their contribution is immeasurable as it is huge. Not only have they helped Google in their success as a profitable business entity, but also helped webmasters both aspiring and experienced. There is also an engender(ment) of "family" or "belonging" in the group that has transcended the best and worst of times (Current forum change still TBD :-) ). We can agree, disagree and agree to disagree but remain respectful and civil (Usually :-) )."
  • "Hi Redleg, Thank you very much for all of the information. Without your help, I don't think I would ever have known how to find the problem. "
  • "What an amazing board. Over the last few days I have asked 1 question and recieved a ton of advice mainly from Autocrat. "
  • "A big thank you to the forum and the contributors that helped me get my site on Google . After some hassle with my web hosters and their naff submission service, issues over adding pages Google can see, issues over Sitemaps, I can now say that when I put my site name into the search and when i put in [custom made watch box], for instance, my site now comes up."
  • "Thank you Autocrat! You are MAGNIFICENT! (...) I am your biggest fan today. : ) Imagine Joe Cocker singing With a Little Help from My Friends...that's my theme song today."
  • "I've done a lot of reading since then and I've learned more in the last year than I learned in the previous 10. When I stumbled into this forum I had no idea what I was getting into but finding this forum was a gift from God! Words cannot express the amount of gratitude I feel for the help you have given me and I wish I could repay you some how.... I don't mean to sound so mushy, but I write this with tears in my eyes and I am truly, truly grateful..."

Are you new to the Webmaster Help Forum? Tell us a little bit about yourself and then join us to learn more and help others!

Tuesday, December 15, 2009

Handling legitimate cross-domain content duplication

Webmaster level: Intermediate

We've recently discussed several ways of handling duplicate content on a single website; today we'll look at ways of handling similar duplication across different websites, across different domains. For some sites, there are legitimate reasons to duplicate content across different websites — for instance, to migrate to a new domain name using a web server that cannot create server-side redirects. To help with issues that arise on such sites, we're announcing our support of the cross-domain rel="canonical" link element.

Ways of handling cross-domain content duplication:
  • Choose your preferred domain
    When confronted with duplicate content, search engines will generally take one version and filter the others out. This can also happen when multiple domain names are involved, so while search engines are generally pretty good at choosing something reasonable, many webmasters prefer to make that decision themselves.
  • Enable crawling and use 301 (permanent) redirects where possible
    Where possible, the most important step is often to use appropriate 301 redirects. These redirects send visitors and search engine crawlers to your preferred domain and make it very clear which URL should be indexed. This is generally the preferred method as it gives clear guidance to everyone who accesses the content. Keep in mind that in order for search engine crawlers to discover these redirects, none of the URLs in the redirect chain can be disallowed via a robots.txt file. Don't forget to handle your www / non-www preference with appropriate redirects and in Webmaster Tools.
  • Use the cross-domain rel="canonical" link element
    There are situations where it's not easily possible to set up redirects. This could be the case when you need to move your website from a server that does not feature server-side redirects. In a situation like this, you can use the rel="canonical" link element across domains to specify the exact URL of whichever domain is preferred for indexing. While the rel="canonical" link element is seen as a hint and not an absolute directive, we do try to follow it where possible.

Still have questions?

Q: Do the pages have to be identical?
A: No, but they should be similar. Slight differences are fine.

Q: For technical reasons I can't include a 1:1 mapping for the URLs on my sites. Can I just point the rel="canonical" at the homepage of my preferred site?
A: No; this could result in problems. A mapping from old URL to new URL for each URL on the old site is the best way to use rel="canonical".

Q: I'm offering my content / product descriptions for syndication. Do my publishers need to use rel="canonical"?
A: We leave this up to you and your publishers. If the content is similar enough, it might make sense to use rel="canonical", if both parties agree.

Q: My server can't do a 301 (permanent) redirect. Can I use rel="canonical" to move my site?
A: If it's at all possible, you should work with your webhost or web server to do a 301 redirect. Keep in mind that we treat rel="canonical" as a hint, and other search engines may handle it differently. But if a 301 redirect is impossible for some reason, then a rel="canonical" may work for you. For more information, see our guidelines on moving your site.

Q: Should I use a noindex robots meta tag on pages with a rel="canonical" link element?
A: No, since those pages would not be equivalent with regards to indexing - one would be allowed while the other would be blocked. Additionally, it's important that these pages are not disallowed from crawling through a robots.txt file, otherwise search engine crawlers will not be able to discover the rel="canonical" link element.

We hope this makes it easier for you to handle duplicate content in a user-friendly way. Are there still places where you feel that duplicate content is causing your sites problems? Let us know in the Webmaster Help Forum!

Friday, December 4, 2009

Your site's performance in Webmaster Tools

Webmaster level: Intermediate

Let's take a quick look at the individual sections in the Google Webmaster Tools' Site Performance feature:

Performance overview

The performance overview shows a graph of the aggregated speed numbers for the website, based on the pages that were most frequently accessed by visitors who use the Google Toolbar with the PageRank feature activated. By using data from Google Toolbar users, you don't have to worry about us testing your site from a location that your users do not use. For example, if your site is in Germany and all your users are in Germany, the chart will reflect the load time as seen in Germany. Similarly, if your users mostly use dial-up connections (or high-speed broadband), that would be reflected in these numbers as well. If only a few visitors of your site use the Google Toolbar, we may not be able to show this data in Webmaster Tools.

The line between the red and the green sections on the chart is the 20th percentile — only 20% of the sites we check are faster than this. This website is pretty close to the 20% mark, which pages would we have to work on first?

Example pages with load times

In this section you can find some example pages along with the average, aggregated load times that users observed while they were on your website. These numbers may differ from what you see as they can come from a variety of different browsers, internet connections and locations. This list can help you to recognize pages which take longer than average to load — pages that slow your users down.

As the page load times are based on actual accesses made by your users, it's possible that it includes pages which are disallowed from crawling. While Googlebot will not be able to crawl disallowed pages, they may be a significant part of your site's user experience.

Keep in mind that you may see occasional spikes here, so it's recommended that you watch the load times over a short period to see what's stable. If you consistently see very large load times, that probably means that most of your users are seeing very slow page loads (whether due to slow connections or otherwise), so it's something you should take seriously.

Page Speed suggestions

These suggestions are based on the Page Speed Firefox / Firebug plugin. In order to find the details for these sample URLs, we fetch the page and all its embedded resources with Googlebot. If we are not able to fetch all of embedded content with Googlebot, we may not be able to provide a complete analysis. Similarly, if the servers return slightly modified content for Googlebot than they would for normal users, this may affect what is shown here. For example, some servers return uncompressed content for Googlebot, similar to what would be served to older browsers that do not support gzip-compressed embedded content (this is currently the case for Google Analytics' "ga.js").

When looking at flagged issues regarding common third-party code such as website analytics scripts, one factor that can also play a role is how wide-spread these scripts are on the web. If they are common across the web, chances are that the average user's browser will have already cached the DNS lookup and the content of the script. While these scripts will still be flagged as separate DNS lookups, in practice they might not play a strong role in the actual load time.

We offer these suggestions as a useful guideline regarding possible first performance improvement steps and recommend using the Page Speed plugin (or a similar tool) directly when working on your website. This allows you to better recognize the blocking issues and makes it easy to see how modifications on the server affect the total load time.

For questions about Webmaster Tools and this new feature, feel free to read the Help Center article, search and post in the Webmaster Help Forums or in the Page Speed discussion group. We hope this information helps you make your website even faster!

Wednesday, December 2, 2009

How fast is your site?

We've just launched Site Performance, an experimental feature in Webmaster Tools that shows you information about the speed of your site and suggestions for making it faster.

This is a small step in our larger effort to make the web faster. Studies have repeatedly shown that speeding up your site leads to increased user retention and activity, higher revenue and lower costs. Towards the goal of making every webpage load as fast as flipping the pages of a magazine, we have provided articles on best practices, active discussion forums and many tools to diagnose and fix speed issues.

Now we bring data and statistics specifically applicable to your site. On Site Performance, you'll find how fast your pages load, how they've fared over time, how your site's load time compares to that of other sites, examples of specific pages and their actual page load times, and Page Speed suggestions that can help reduce user-perceived latency. Our goal is to bring you specific and actionable speed information backed by data, so stay tuned for more of this in the future.

screenshot of Site Performance

The load time data is derived from aggregated information sent by users of your site who have installed the Google Toolbar and opted-in to its enhanced features. We only show the performance charts and tables when there's enough data, so not all of them may be shown if your site has little traffic. The data currently represents a global average; a specific user may experience your site faster or slower than the average depending on their location and network conditions.

This is a Labs product that is still in development. We hope you find it useful. Please let us know your feedback through the Webmaster Tools Forum.

Update on 12/04/2009: Our team just reconvened to provide you more information on this feature. Check out JohnMu's latest post on Site Performance!

New User Agent for News

Webmaster Level: Intermediate

Today we are announcing a new user agent for robots.txt called Googlebot-News that gives publishers even more control over their content. In case you haven't heard of robots.txt, it's a web-wide standard that has been in use since 1994 and which has support from all major search engines and well-behaved "robots" that process the web. When a search engine checks whether it has permission to crawl and index a web page, the "check if we're allowed to crawl this page" mechanism is robots.txt.

Publishers could easily contact us via a form if they didn't want to be included in Google News but did want to be in Google's web search index. Now, publishers can manage their content in Google News in an even more automated way. Site owners can just add Googlebot-News specific directives to their robots.txt file. Similar to the Googlebot and Googlebot-Image user agents, the new Googlebot-News user agent can be used to specify which pages of a website should be crawled and ultimately appear in Google News.

Here are a few examples for publishers:

Include pages in both Google web search and News:
User-agent: Googlebot

This is the easiest case. In fact, a robots.txt file is not even required for this case.

Include pages in Google web search, but not in News:
User-agent: Googlebot

User-agent: Googlebot-News
Disallow: /

This robots.txt file says that no files are disallowed from Google's general web crawler, called Googlebot, but the user agent "Googlebot-News" is blocked from all files on the website.

Include pages in Google News, but not Google web search:
User-agent: Googlebot
Disallow: /

User-agent: Googlebot-News

When parsing a robots.txt file, Google obeys the most specific directive. The first two lines tell us that Googlebot (the user agent for Google's web index) is blocked from crawling any pages from the site. The next directive, which applies to the more specific user agent for Google News, overrides the blocking of Googlebot and gives permission for Google News to crawl pages from the website.

Block different sets of pages from Google web search and Google News:
User-agent: Googlebot
Disallow: /latest_news

User-agent: Googlebot-News
Disallow: /archives

The pages blocked from Google web search and Google News can be controlled independently. This robots.txt file blocks recent news articles (URLs in the /latest_news folder) from Google web search, but allows them to appear on Google News. Conversely, it blocks premium content (URLs in the /archives folder) from Google News, but allows them to appear in Google web search.

Stop Google web search and Google News from crawling pages:
User-agent: Googlebot
Disallow: /

This robots.txt file tells Google that Googlebot, the user agent for our web search crawler, should not crawl any pages from the site. Because no specific directive for Googlebot-News is given, our News search will abide by the general guidance for Googlebot and will not crawl pages for Google News.

For some queries, we display results from Google News in a discrete box or section on the web search results page, along with our regular web search results. We sometimes do this for Images, Videos, Maps, and Products, too. This is known as Universal search results. Since Google News powers Universal "News" search results, if you block the Googlebot-News user agent then your site's news stories won't be included in Universal search results.

We are currently testing our support for the new user agent. If you see any problems please let us know. Note that it is possible for Google to return a link to a page in some situations even when we didn't crawl that page. If you'd like to read more about robots.txt, we provide additional documentation on our website. We hope webmasters will enjoy the flexibility and easier management that the Googlebot-News user agent provides.

Tuesday, December 1, 2009

Region Tags in Google Search Results

Webmaster Level: All

Country-code top-level domains (or ccTLDs) can provide people with a quick and valuable clue about the location of a website—for example, ".fr" for France or "" for Japan. However, for certain top level domains like .com, .info and .org, it's not as easy to figure out the location. That's why today we're adding region information supplied by webmasters to the green address line on some Google search results.

This feature is easiest to explain through an example. Let's say you've heard about a boxing club in Canada called "Capital City Boxing." You try a search for [capital city boxing] to find out more, but it's hard to tell which result is the one you're looking for. Here's a screen shot:

None of the results provide any location information in the title or snippet, nor do they have a regional TLD (such as .ca for Canada). The only way to find the result you're looking for is to refine your search ([capital city boxing canada] works) or click through the various links to figure it out. Clicking through the first result reveals that there's apparently another "Capital City Boxing" club in Alabama.

Region tags improve search results by providing valuable information about website location right in the green URL line. Continuing our prior example, here's a screen shot of the new region tag (circled in red):

As you can see, the fourth result now includes the region name "Canada" after the green URL, so you can immediately tell that this result relates to the boxing club in Canada. With the new display, you no longer need to refine your search or click through the results to figure out which page is the one you're looking for. In general, our hope is that these region tags will help searchers more quickly identify which results are most relevant to their queries.

As a webmaster, you can control how this feature works by adjusting your Geographic Targeting settings. Log in to Webmaster Tools and choose Site configuration > Settings > Geographic Target. From here you can associate a particular country/region with your site. These settings will determine the name that appears as a region tag. You can learn more about using the Geographic Target tool in a prior blog post and in our Help Center.

We currently show region tags only for certain domains such as .com and .net where the location information would otherwise be unclear. We don't show region tags for results on domains like .br for Brazil, because the location is already implied by the green URL line in our default display. In addition, we only display region tags when the region supplied by the site owner is different from the domain where the search was entered. For example, if you do a search from the Singapore Google domain (, we won't show you region tags for all the websites webmasters have targeted to Singapore because we'd end up tagging too many results, and the tag is really most relevant for foreign regions. For the initial release, we anticipate roughly 1% of search results pages will include webpages with a region tag.

We hope you'll find this new feature useful, and we welcome your feedback.

Changes in First Click Free

Webmaster level: Intermediate

We love helping publishers make their content available to large groups of readers, and working on ways to make the world's information useful and accessible through our search results. At the same time, we're also aware of the fact that creating high-quality content is not easy and, in many cases, expensive. This is one of the reasons why we initially launched First Click Free for Google News and Google Web Search -- to allow publishers to sell access to their content in general while still allowing users to find it through our search results.

While we're happy to see that a number of publishers are already using First Click Free, we've found that some who might try it are worried about people abusing the spirit of First Click Free to access almost all of their content. As most users are generally happy to be able to access just a few pages from these premium content providers, we've decided to allow publishers to limit the number of accesses under the First Click Free policy to five free accesses per user each day. This change applies to both Google News publishers as well as websites indexed in Google's Web Search. We hope that this encourages even more publishers to open up more content to users around the world!

Questions and answers about First Click Free

Q: Do the rest of the old guidelines still apply?
A: Yes, please check the guidelines for Google News as well as the guidelines for Web Search and the associated blog post for more information.

Q: Can I apply First Click Free to only a section of my site / only for Google News (or only for Web Search)?
A: Sure! Just make sure that both Googlebot and users from the appropriate search results can view the content as required. Keep in mind that showing Googlebot the full content of a page while showing users a registration page would be considered cloaking.

Q: Do I have to sign up to use First Click Free?
A: Please let us know about your decision to use First Click Free if you are using it for Google News. There's no need to inform us of the First Click Free status for Google Web Search.

Q: What is the preferred way to count a user's accesses?
A: Since there are many different site architectures, we believe it's best to leave this up to the publisher to decide.

(Please see our related blog post for more information on First Click Free for Google News.)