Friday, December 21, 2007

A Festivus for our webmasterus

If it's good enough for the Costanzas, it's good enough for Webmaster Central: it's time for a Festivus for the rest of us (webmasterus)!
Our special celebration begins not with carols and eggnog, but by remembering some of the popular Webmaster Tools features -- make that Feats of Strength -- for 2007. This year, you gained the ability to chickity-check out your backlinks (<-- that's Festivus-inspired anchor text) and tell Google you want out with URL Removal. And let's not forget Message Center and IDNA support, perfect for those times when [a-zA-Z0-9\-] just doesn't cut it.

Feel the power! Festivus Feats of Strength!

Now comes our webmaster family's traditional Airing of Grievances. You can air your woes and "awww man!"s in the comments below. Just remember that bots may crawl this blog, but we humans review the comments, so please keep your grievances constructive. :) Let us know about features you'd like implemented in Webmaster Tools, articles you'd like written in our blog or Help Center, and stuff you'd like to see in the discussion group. Bonus points if you also explain how your suggestion helps the whole Internet—not just your site's individual rankings. (But of course, we understand that your site ranking number one for all queries in all regions is truly, objectively good for everyone.)

Last, there are so many Festivus Miracles to share! Such as the many helpful members of the discussion group from all around the world, the new friendships formed between Susan Moskwa, JohnMu, Wysz, Matt D, Bergy, Patrick, Nathanj and so many webmasters, and the fun of chatting with our video watchers, fellow conference attendees, and those in the blogosphere keepin' it real.

On behalf of the entire Webmaster Central team, here's to you, Festivus Miracle and Time Magazine's Person of the Year in 2006 -- happy holidays. See you in 2008. :)

Tuesday, December 18, 2007

The Ultimate Fate of Supplemental Results

In 2003, Google introduced a "supplemental index" as a way of showing more documents to users. Most webmasters will probably snicker about that statement, since supplemental docs were famous for refreshing less often and showing up in search results less often. But the supplemental index served an important purpose: it stored unusual documents that we would search in more depth for harder or more esoteric queries. For a long time, the alternative was to simply not show those documents at all, but this was always unsatisfying—ideally, we would search all of the documents all of the time, to give users the experience they expect.

This led to a major effort to rethink the entire supplemental index. We improved the crawl frequency and decoupled it from which index a document was stored in, and once these "supplementalization effects" were gone, the "supplemental result" tag itself—which only served to suggest that otherwise good documents were somehow suspect—was eliminated a few months ago. Now we're coming to the next major milestone in the elimination of the artificial difference between indices: rather than searching some part of our index in more depth for obscure queries, we're now searching the whole index for every query.

From a user perspective, this means that you'll be seeing more relevant documents and a much deeper slice of the web, especially for non-English queries. For webmasters, this means that good-quality pages that were less visible in our index are more likely to come up for queries.

Hidden behind this are some truly amazing technical feats; serving an index this much larger doesn't happen easily, and it took several fundamental innovations to make it possible. At this point it's safe to say that the Google search engine works like nothing else in the world. If you want to know how it actually works, you'll have to come join Google Engineering; as usual, it's all triple-hush-hush secrets.*

* Originally, I was going to give the stock Google answer, "If I told you, I'd have to kill you." However, I've been informed by management that killing people violates our "Don't be evil" policy, so I'm forced to replace that with sounding mysterious and suggesting that good engineers come and join us. Which I'm dead serious about; if you've got the technical chops and want to work on some of the most complex and advanced large-scale software infrastructure in the world, we want you here.

Taking feeds out of our web search results

As a webmaster, you may have been concerned about your RSS/Atom feeds crowding out their associated HTML pages in Google's search results. By serving feeds, we could cause a poor user experience:
  1. Feeds increase the likelihood that users see duplicate search results.
  2. Users clicking on a feed may miss valuable content available only in the HTML page.
To address these concerns, we prevent feeds from being returned in Google's search results, with the exception of podcasts (feeds with multimedia enclosures). We continue to allow podcasts, because we noticed a significant number of them are standalone documents (i.e. no HTML page has the same content) or they have more complete item descriptions than the associated HTML page. However, if, as a webmaster, you'd like your podcasts to be excluded from Google's search results (e.g. if you have a vlog, its feed is probably a podcast), you can use Yahoo's spec for noindex feeds. If you use FeedBurner, making your podcast noindex is as simple as checking a box ("Noindex" under the "Publicize" tab).

As a user, you may ask yourself whether Google has a way to search for feeds. The answer is yes; both Google Reader and iGoogle allow searching for feeds to subscribe to.

We're aware that there are a few non-podcast feeds out there with no associated HTML pages, and thus removing these feeds from the search results for now might be less than ideal. We remain open to other feedback on how to improve the handling of feeds, and especially welcome your comments and questions in the Crawling, Indexing and Ranking subtopic of our Webmaster Help Group.

For the German version of this post, go to "Wir entfernen Feeds aus unseren Suchergebnissen."

Monday, December 17, 2007

Introducing Video Sitemaps

Written by John Fisher-Ogden, Software Engineer, and Amy Wu, Associate Product Manager

In our effort to help users search all the world's public videos, the Google Video team joined the Sitemaps folks to introduce Video Sitemaps—an extension of the Sitemap Protocol that helps make your videos more searchable via Google Video Search. By submitting this video-specific Sitemap in addition to your standard Sitemap, you can specify all the video files on your site, along with relevant metadata. Here's an example:

<urlset xmlns="…" xmlns:video="…">
<url><loc>…</loc>
<video:video>
<video:player_loc allow_embed="yes">…</video:player_loc>
<video:title>My funny video</video:title>
<video:description>A really awesome video</video:description>
</video:video></url>
</urlset>

To get started, create a Video Sitemap, sign into Google Webmaster Tools, and add the Video Sitemap to your account.
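
If you generate your Sitemaps programmatically, a Video Sitemap can be produced the same way. The following is only a minimal sketch using Python's standard library. The namespace URIs, the `build_video_sitemap` helper, and the dictionary keys are assumptions made for illustration and are not taken from this post, so check the current Video Sitemap documentation before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumption: these namespace URIs are not given in the post; verify them
# against the current Sitemap and Video Sitemap documentation.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def build_video_sitemap(entries):
    """Return a Video Sitemap as an XML string.

    Each entry is a dict with the hypothetical keys page_url, player_url,
    title and description.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = entry["page_url"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        player = ET.SubElement(
            video, f"{{{VIDEO_NS}}}player_loc", allow_embed="yes"
        )
        player.text = entry["player_url"]
        ET.SubElement(video, f"{{{VIDEO_NS}}}title").text = entry["title"]
        ET.SubElement(video, f"{{{VIDEO_NS}}}description").text = entry["description"]
    return ET.tostring(urlset, encoding="unicode")
```

Calling `build_video_sitemap([...])` with one dictionary per video returns the XML text, which you can write to a file and submit alongside your standard Sitemap.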

Friday, December 14, 2007

FYI on Google Toolbar's latest features

The latest version of Google Toolbar for Internet Explorer (beta) just added a neat feature to help users arrive at your website, or at least see your content, even when things go awry.

It's frustrating for your users to mistype your URL and receive a generic "404 - Not Found" or try to access a part of your site that might be down.

No matter how useful and information-rich your site is, when these issues arise, most users just move on to something else. The latest release of Google Toolbar, however, helps users by detecting site issues and providing alternatives.

Website Optimizer or Website Optimiser? The Toolbar can help you find it even if you try "google.cmo" instead of "".

3 site issues detected by Google Toolbar

  1. 404 errors with default error pages
    When a visitor tries to reach your content with an invalid URL and your server returns a short, default error message (fewer than 512 bytes), the Toolbar will suggest an alternate URL to the visitor. If this is a general problem in your website, you will also see these URLs listed in the crawl errors section of your Webmaster Tools account.

    If you choose to set up a custom error page, make sure it returns the HTTP status code 404. The content of the 404 page can help your visitors understand that they tried to reach a missing page and can suggest how to find the content they were looking for. When a site displays a custom error page, the Toolbar will no longer provide suggestions for that site. You can check the behavior of the Toolbar by visiting an invalid URL on your site with the Google Toolbar installed.

  2. DNS errors
    When a URL contains a non-existent domain name, the Toolbar will suggest an alternate, similar-looking URL with a valid domain name.

  3. Connection failures
    When your server is unreachable, the Google Toolbar will automatically display a link to the cached version of your page. This feature is only available when Google is not explicitly forbidden from caching your pages through use of a robots meta tag or crawling is blocked on the page through the robots.txt file. If your server is regularly unreachable, you will probably want to fix that first; but it may also be a good idea to check the Google cache for your pages by looking at the search results for your site.
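
For the custom error page mentioned in the first item, the important detail is that the friendly page is still sent with a real 404 status. Here is a minimal sketch of that idea using Python's built-in `http.server`. The page list, the template, and the `render` and `serve` helpers are hypothetical; a production site would normally configure this in its web server instead.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of pages this toy site serves.
KNOWN_PAGES = {"/": b"<html><body><h1>Home</h1></body></html>"}

NOT_FOUND_TEMPLATE = """<html><head><title>Page not found</title></head>
<body>
<h1>Sorry, we could not find {path}</h1>
<p>The page may have moved, or the address may contain a typo.</p>
<ul>
<li><a href="/">Go to the homepage</a></li>
<li><a href="/sitemap.html">Browse the site map</a></li>
</ul>
</body></html>"""


def render(path):
    """Return (status, body) for a request path."""
    if path in KNOWN_PAGES:
        return 200, KNOWN_PAGES[path]
    body = NOT_FOUND_TEMPLATE.format(path=html.escape(path)).encode("utf-8")
    # Helpful content for the visitor, but still a real 404 for clients.
    return 404, body


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = render(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the toy site locally until interrupted."""
    HTTPServer(("localhost", port), SiteHandler).serve_forever()
```

Running `serve()` and requesting an unknown path should show the custom page while the response status stays 404, which is the behavior described above.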

Suggestions provided by the Google Toolbar

When one of the above situations is found, the Toolbar will try to find the most helpful links for the user. That may include:
  • A link to the corrected URL
    When the Toolbar can find the most probable, active URL to match the user's input (or link they clicked on), it will display it right on top as a suggestion. The correction can be somewhere in the domain name, the path or the file name (the Toolbar does not look at any parameters in the URL).

  • A link to the cached version of the URL
    When the Toolbar recognizes the URL in the Google cache, it will display a link to the cached version. This is particularly useful when the user can't access your pages for some reason. As mentioned above, Google may cache your URLs provided you're not explicitly forbidding this through use of a robots meta tag or the robots.txt file.

  • A link to the homepage or HTML site map of your site
    Sometimes going to the homepage or a site map page is the best way to find the page that a user is really looking for. Site map pages (these are not XML Sitemap files) are generally recognized based on the file name; if the Toolbar can find something called "sitemap.html" or similar, this page will probably be recognized as the site map page. Don't worry if your site map page is called something else; if a user decides to go to your homepage, they'll probably find it right away even if the Toolbar doesn't spot it.

  • A link to a higher level folder
    Sometimes the homepage or site map page is too far out and the user would be better off just going one step up in the hierarchy. When the Toolbar can recognize that your site's structure is based on folders and sub-folders, it may suggest a page one step back.

  • A search within your site for keywords found in the URL
    It's a good practice to use descriptive URLs. If the Toolbar can recognize keywords within the URL which the user tried to access, it will link to a site-search with those keywords. Even if the URL has changed significantly in the meantime, the search may be able to find similar content based on those keywords. For instance, if the URL contained the words "party", "gifts" and "holidays", the Toolbar would suggest a search for those words within the site.

  • An open Google search box
    If all else fails, there's always a chance that similar content already exists elsewhere on the web. Google web search can help your users find it; the Toolbar helps by adding the keywords found in the URL to the search box.
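
To get a feel for why descriptive URLs and a folder-based structure make these suggestions possible, here is a rough sketch of how a parent folder and a keyword list could be derived from a URL. This is not the Toolbar's actual logic, which isn't described here; the function names and the example URL in the note below are made up for illustration. Like the Toolbar, the sketch ignores URL parameters.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def parent_folder(url):
    """Return the URL one step up in the folder hierarchy, query dropped."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parent = "/" + "/".join(segments[:-1])
    if not parent.endswith("/"):
        parent += "/"
    return urlunsplit((parts.scheme, parts.netloc, parent, "", ""))


def url_keywords(url):
    """Split the path (not the query string) into lowercase keywords."""
    path = urlsplit(url).path
    path = re.sub(r"\.[A-Za-z0-9]+$", "", path)  # drop a trailing file extension
    return [word.lower() for word in re.split(r"[^A-Za-z0-9]+", path) if word]
```

For a hypothetical URL such as `http://www.example.com/party-gifts/holidays.html?id=7`, `url_keywords` yields the words party, gifts and holidays, and `parent_folder` points at the `/party-gifts/` folder. A URL like `/p?id=7` would yield almost nothing useful, which is one more argument for descriptive URLs.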

Are you curious already? Download the Google Toolbar for your browser and give it a try on your site!

To discuss how this feature can help visitors to your site, jump in to our Google Webmaster Help Group; or for general Google Toolbar questions, try the Toolbar group for Internet Explorer or the Toolbar group for Firefox.

Thursday, December 13, 2007

New: Content analysis and Sitemap details, plus more languages

We're always striving to help webmasters build outstanding websites, and in our latest release we have two new features: Content analysis and Sitemap details. We hope these features help you to build a site you could compare to a fine wine -- getting better and better over time.

Content analysis

To help you improve the quality of your site, our new content analysis feature should be a helpful addition to the crawl error diagnostics already provided in Webmaster Tools. Content analysis contains feedback about issues that may impact the user experience or that may make it difficult for Google to crawl and index pages on your site. By reviewing the areas we've highlighted, you can help eliminate potential issues that could affect your site's ability to be crawled and indexed. This results in better indexing of your site by Google and other search engines.

The Content analysis summary page within the Diagnostics section of Webmaster Tools features three main categories. Click on a particular issue type for more details:

  • Title tag issues
  • Meta description issues
  • Non-indexable content issues

Selecting "Duplicate title tags" displays a list of repeated page titles along with a count of how many pages contain that title. We currently present up to thirty duplicated page titles on the details page. If the duplicate title issues shown are corrected, we'll update the list to reflect any other pages that share duplicate titles the next time your website is crawled.

Also, in the Title tag issues category, we show "Long title tags" and "Short title tags." For these issue types we will identify title tags that are way too short (for example "IT" isn't generally a good title tag) or way too long (title tag was never intended to mean <insert epic novel here>). A similar algorithm identifies potentially problematic meta description tags. While these pointers won't directly help you rank better (i.e. pages with <title> length x aren't moved to the top of the search results), they may help your site display better titles and snippets in search results, and this can increase visitor traffic.
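
If you'd like to run a similar check on your own pages before the next crawl, the idea is easy to approximate. The sketch below assumes you already have a mapping from URL to title text; the length thresholds are hypothetical, since the limits Webmaster Tools uses are not stated here.

```python
from collections import Counter

# Hypothetical thresholds; the post does not say what counts as
# "way too short" or "way too long".
MIN_TITLE_CHARS = 5
MAX_TITLE_CHARS = 70


def analyze_titles(titles_by_url):
    """Report duplicate, short and long titles for a {url: title} mapping."""
    counts = Counter(title.strip() for title in titles_by_url.values())
    duplicates = {title: n for title, n in counts.items() if n > 1}
    short = sorted(
        url for url, title in titles_by_url.items()
        if len(title.strip()) < MIN_TITLE_CHARS
    )
    too_long = sorted(
        url for url, title in titles_by_url.items()
        if len(title.strip()) > MAX_TITLE_CHARS
    )
    return {"duplicates": duplicates, "short": short, "long": too_long}
```

The `duplicates` entry mirrors the report described above: each repeated title together with the number of pages that use it.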

In the "Non-indexable content issues" category, we give you a heads-up about areas that aren't as friendly to our more text-based crawler. Be sure to check out our posts on Flash and images to learn how to make these items more search-engine friendly.

Sitemap details page

If you've submitted a Sitemap, you'll be happy when you see the additional information in Webmaster Tools revealing how your Sitemap was processed. You can find this information on the newly available Sitemap Details page which (along with information that was previously provided for each of your Sitemaps) shows you the number of pages from your Sitemap that were indexed. Keep in mind the number of pages indexed from your Sitemap may not be 100% accurate because the indexed number is updated periodically, but it's more accurate than estimating the count with a custom query on Google.

The new Sitemap Details page also lists any errors or warnings that were encountered when specific pages from your Sitemap were crawled. So the time you might previously have spent crafting custom Google queries to determine how many pages from your Sitemap were indexed can now be spent on improving your site. If your site is already the crème de la crème, you might prefer to spend the extra free time mastering your ice-carving skills or blending the perfect eggnog.

Here's a view of the new Sitemap details page:

Sitemaps are an excellent way to tell Google about your site's most important pages, especially if you have new or updated content that we may not know about. If you haven't yet submitted a Sitemap or have questions about the process, visit our Webmaster Help Center to learn more.

Webmaster Tools now available in Czech & Hungarian

We love expanding our product to help more people in their language of choice. We recently added Czech and Hungarian to Webmaster Tools, in addition to the 20 other languages we already support. We won't be stopping here: if your language of choice isn't currently supported, stay tuned -- there'll be even more supported languages to come.

We always love to hear what you think. Please visit our Webmaster Help Group to share comments or ask questions.

Thursday, December 6, 2007

Using ALT attributes smartly

Here's the second of our video blog posts. Matt Cutts, the head of Google's webspam team, provides some useful tips on how to optimize the images you include on your site, and how simply providing useful, accurate information in your ALT attributes can make your photos and pictures more discoverable on the web. Ms Emmy Cutts also makes an appearance.

Like videos? Hate them? Have a great idea we should cover? Let us know what you think in our Webmaster Help Group.

Update: Some of you have asked about the difference between the "alt" and "title" attributes. According to the W3C recommendations, the "alt" attribute specifies an alternate text for user agents that cannot display images, forms or applets. The "title" attribute is a bit different: it "offers advisory information about the element for which it is set." As the Googlebot does not see the images directly, we generally concentrate on the information provided in the "alt" attribute. Feel free to supplement the "alt" attribute with "title" and other attributes if they provide value to your users!
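
If you want to find images on your own pages that have no "alt" text yet, a small script can list them. This is just a sketch using Python's standard `html.parser`; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collect the src of every <img> whose alt attribute is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "img":
            return
        attributes = dict(attrs)
        if not (attributes.get("alt") or "").strip():
            self.missing_alt.append(attributes.get("src"))


def images_missing_alt(html_text):
    """Return the src values of images without useful alt text."""
    audit = AltAudit()
    audit.feed(html_text)
    return audit.missing_alt
```

An empty "alt" can be a deliberate choice for purely decorative images, so treat the result as a list of candidates to review rather than a list of errors.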

Tuesday, December 4, 2007

Answering more popular picks: meta tags and web search

Written by , Webmaster Trends Analyst, Zürich

In writing and maintaining accurate meta tags (e.g., descriptive titles and robots information), you help Google to more accurately crawl, index and return your site in search results. Meta tags provide information to all sorts of clients, such as browsers and search engines. Just keep in mind that each client will likely only interpret the meta tags that it uses, and ignore the rest (although they might be useful for other reasons).

Here's how Google would interpret meta tags of this sample HTML page:

<!DOCTYPE …><head>
<title>Traditional Swiss cheese fondue recipes</title> <!-- utilized by Google, accuracy is valuable to webmasters -->
<meta name="description" content="Cheese fondue is …"> <!-- utilized by Google, can be shown in our search results -->
<meta name="revisit-after" content="14 days"> <!-- not utilized by Google or other major search engines -->
<META name="verify-v1" content="e8JG…Nw=" /> <!-- optional, for Google webmaster tools -->
<meta name="GoogleBot" content="noOdp"> <!-- optional -->
<meta …>
<meta …>

<meta name="description" content="A description of the page">
This tag provides a short description of the page. In some situations this description is used as a part of the snippet shown in the search results. For more information, please see our blog post "Improve snippets with a meta description makeover" and the Help Center article "How do I change my site's title and description?" While the use of a description meta tag is optional and will have no effect on your rankings, a good description can result in a better snippet, which in turn can help to improve the quality and quantity of visitors from our search results.

<title>The title of the page</title>
While technically not a meta tag, this tag is often used together with the "description." The contents of this tag are generally shown as the title in search results (and of course in the user's browser when visiting the page or viewing bookmarks). Some additional information can be found in our blog post "Target visitors or search engines?", especially under "Make good use of page titles."

<meta name="robots" content="…, …">
<meta name="googlebot" content="…, …">
These meta tags control how search engines crawl and index the page. The "robots" meta tag specifies rules that apply to all search engines, the "googlebot" meta tag specifies rules that apply only to Google. Google understands the following values (when specifying multiple values, separate them with a comma):

The default rule is "index, follow" -- this is used if you omit this tag entirely or if you specify content="all". Additional information about the "robots" meta tag can be found in "Using the robots meta tag." As a side-note, you can now also specify this information in the header of your pages using the "X-Robots-Tag" HTTP header directive. This is particularly useful if you wish to fine-tune crawling and indexing of non-HTML files like PDFs, images or other kinds of documents.
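
Since non-HTML files can't carry a meta tag, the "X-Robots-Tag" header has to be added by whatever serves the file. The following is a minimal sketch as a Python WSGI application. The policy (keep PDFs out of the index), the function names, and the placeholder response body are assumptions for illustration; most sites would set this header in their web server configuration instead.

```python
def robots_headers(path):
    """Return extra HTTP response headers for a request path.

    Hypothetical policy: ask search engines not to index PDF files.
    """
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex"}
    return {}


def application(environ, start_response):
    """Minimal WSGI app that attaches the header where the policy applies."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    headers.extend(robots_headers(path).items())
    start_response("200 OK", headers)
    return [b"placeholder body\n"]  # a real app would stream the file here
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` is enough; inspect the response headers for a `.pdf` path and for any other path to see the difference.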

<meta name="google" content="notranslate">
When we recognize that the contents of a page are not in the language that the user is likely to want to read, we often provide a link in the search results to an automatic translation of your page. In general, this gives you the chance to provide your unique and compelling content to a much larger group of users. However, there may be situations where this is not desired. By using this meta tag, you can signal that you do not wish for Google to provide a link to a translation for this page. This meta tag generally does not influence the ranking of the page for any particular language. More information can be found in the "Google Translate FAQ".

<meta name="verify-v1" content="…">
This Google webmaster tools-specific meta tag is used on the top-level page of your site to verify ownership of a site in webmaster tools (alternatively you may upload an HTML file to do this). The content value you put into this tag is provided to you in your webmaster tools account. Please note that while the contents of this meta tag (including upper and lower case) must match exactly what is provided to you, it does not matter if you change the tag from XHTML to HTML or if the format of the tag matches the format of your page. For details, see "How do I verify my site by adding a meta tag to my site's home page?"

<meta http-equiv="Content-Type" content="…; charset=…">
This meta tag defines the content-type and character set of the page. When using this meta tag, make sure that you surround the value of the content attribute with quotes; otherwise the charset attribute may be interpreted incorrectly. If you decide to use this meta tag, it goes without saying that you should make sure that your content is actually in the specified character set. "Google Webauthoring Statistics" has interesting numbers on the use of this meta tag.

<meta http-equiv="refresh" content="…;url=…">
This meta tag sends the user to a new URL after a certain amount of time, and is sometimes used as a simple form of redirection. This kind of redirect is not supported by all browsers and can be confusing to the user. If you need to change the URL of a page as it is shown in search engine results, we recommend that you use a server-side 301 redirect instead. Additionally, W3C's "Techniques and Failures for Web Content Accessibility Guidelines 2.0" lists it as being deprecated.
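
A server-side 301 redirect means the server itself answers the old URL with status 301 and a "Location" header, rather than serving a page that refreshes. Here is a minimal sketch as a Python WSGI application. The `MOVED` mapping and its paths are hypothetical, and in practice this is usually configured in the web server rather than in application code.

```python
# Hypothetical mapping from old paths to their new locations.
MOVED = {"/old-fondue.html": "/recipes/fondue.html"}


def redirect_app(environ, start_response):
    """WSGI app answering moved paths with a permanent (301) redirect."""
    path = environ.get("PATH_INFO", "/")
    if path in MOVED:
        start_response("301 Moved Permanently", [("Location", MOVED[path])])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Not found\n"]
```

Unlike a refresh meta tag, this response works for any client that understands HTTP redirects, and the user never sees an intermediate page.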

(X)HTML and Capitalization
Google can read both HTML and XHTML-style meta tags (regardless of the code used on the page). In addition, upper or lower case is generally not important in meta tags -- we treat <TITLE> and <title> equally. The "verify-v1" meta tag is an exception: it's case-sensitive.

revisit-after, Sitemap lastmod and changefreq
Occasionally webmasters needlessly include "revisit-after" to influence a search engine's crawl schedule; however, this meta tag is largely ignored. If you want to give search engines information about changes in your pages, use and submit an XML Sitemap. In this file you can specify the last-modified date and the change-frequency of the URLs on your site.
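
As a rough illustration of what such a file carries, the sketch below builds a small XML Sitemap with `lastmod` and `changefreq` entries using Python's standard library. The `build_sitemap` helper and its input format are made up for this example, and the namespace URI is the one used by the public Sitemap protocol rather than something quoted in this post.

```python
import xml.etree.ElementTree as ET

# Namespace of the public Sitemap protocol (not quoted in this post).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a Sitemap from (url, lastmod, changefreq) tuples.

    lastmod is a date string such as "2007-12-04"; changefreq is one of the
    protocol's values, for example "daily", "weekly" or "monthly".
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod, changefreq in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")
```

Search engines treat these values as hints, so keep them truthful; a page marked "daily" that never changes is no more useful than a "revisit-after" tag.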

If you're interested in more examples or have questions about the meta tags mentioned above, jump into our Google Webmaster Help Group and join the discussion.

Update: In case you missed it, the other popular picks were answered in the Webmaster Help Group.

Saturday, December 1, 2007

Information about buying and selling links that pass PageRank

Our goal is to provide users the best search experience by presenting equitable and accurate results. We enjoy working with webmasters, and an added benefit of our working together is that when you make better and more accessible content, the internet, as well as our index, improves. This in turn allows us to deliver more relevant search results to users.

If, however, a webmaster chooses to buy or sell links for the purpose of manipulating search engine rankings, we reserve the right to protect the quality of our index. Buying or selling links that pass PageRank violates our webmaster guidelines. Such links can hurt relevance by causing:

- Inaccuracies: False popularity and links that are not fundamentally based on merit, relevance, or authority
- Inequities: Unfair advantage in our organic search results to websites with the biggest pocketbooks

In order to stay within Google's quality guidelines, paid links should be disclosed through a rel="nofollow" attribute or other techniques such as doing a redirect through a page which is robots.txt'ed out. Here's more information explaining our stance on buying and selling links that pass PageRank:

February 2003: Google's official quality guidelines have advised "Don't participate in link schemes designed to increase your site's ranking or PageRank" for several years.

September 2005: I posted on my blog about text links and PageRank.

December 2005: Another post on my blog discussed this issue, and said

Many people who work on ranking at search engines think that selling links can lower the quality of links on the web. If you want to buy or sell a link purely for visitors or traffic and not for search engines, a simple method exists to do so (the nofollow attribute). Google’s stance on selling links is pretty clear and we’re pretty accurate at spotting them, both algorithmically and manually. Sites that sell links can lose their trust in search engines.

September 2006: In an interview with John Battelle, I noted that "Google does consider it a violation of our quality guidelines to sell links that affect search engines."

January 2007: I posted on my blog to remind people that "links in those paid-for posts should be made in a way that doesn’t affect search engines."

April 2007: We provided a mechanism for people to report paid links to Google.

June 2007: I addressed paid links in my keynote discussion during the Search Marketing Expo (SMX) conference in Seattle. Here's a video excerpt from the keynote discussion. It's less than a minute long, but highlights that Google is willing to use both algorithmic and manual detection of paid links that violate our quality guidelines, and that we are willing to take stronger action on such links in the future.

June 2007: A post on the official Google Webmaster Blog noted that "Buying or selling links to manipulate results and deceive search engines violates our guidelines." The post also introduced a new official form in Google's webmaster console so that people could report buying or selling of links.

June 2007: Google added more specific guidance to our official webmaster documentation about how to report buying or selling links and what sort of link schemes violate our quality guidelines.

August 2007: I described Google's official position on buying and selling links in a panel dedicated to paid links at the Search Engine Strategies (SES) conference in San Jose.

September 2007: In a post on my blog recapping the SES San Jose conference, I also made my presentation available to the general public (PowerPoint link).

October 2007: Google provided comments for a Forbes article titled "Google Purges the Payola".

October 2007: Google officially confirmed to Search Engine Land that we were taking stronger action on this issue, including decreasing the toolbar PageRank of sites selling links that pass PageRank.

October 2007: An email that I sent to Search Engine Journal also made it clear that Google was taking stronger action on buying/selling links that pass PageRank.

We appreciate the feedback that we've received on this issue. A few of the more prevalent questions:

Q: Is buying or selling links that pass PageRank a violation of Google's guidelines? Why?
A: Yes, it is, for the reasons we mentioned above. I also recently did a post on my personal blog that walks through an example of why search engines wouldn't want to count such links. On a serious medical subject (brain tumors), we highlighted people being paid to write about a brain tumor treatment when they hadn't been aware of the treatment before, and we saw several cases where people didn't do basic research (or even spellchecking!) before writing paid posts.

Q: Is this a Google-only issue?
A: No. All the major search engines have opposed buying and selling links that affect search engines. For the Forbes article "Google Purges the Payola", Andy Greenberg asked other search engines about their policies, and the results were unanimous. From the story:

Search engines hate this kind of paid-for popularity. Google's Webmaster guidelines ban buying links just to pump search rankings. Other search engines including Ask, MSN, and Yahoo!, which mimic Google's link-based search rankings, also discourage buying and selling links.

Other engines have also commented about this individually, e.g. a search engine representative from Microsoft commented in a recent interview and said

The reality is that most paid links are a.) obviously not objective and b.) very often irrelevant. If you are asking about those then the answer is absolutely there is a risk. We will not tolerate bogus links that add little value to the user experience and are effectively trying to game the system.

Q: Is that why we've seen some sites that sell links receive lower PageRank in the Google toolbar?
A: Yes. If a site is selling links, that can affect our opinion about the value of that site or cause us to lose trust in that site.

Q: What recourse does a site owner have if their site was selling links that pass PageRank, and the site's PageRank in the Google toolbar was lowered?
A: The site owner can address the violations of the webmaster guidelines and submit a reconsideration request in Google's Webmaster Central console. Before doing a reconsideration request, please make sure that all sold links either do not pass PageRank or are removed.

Q: Is Google trying to tell webmasters how to run their own site?
A: No. We're giving advice to webmasters who want to do well in Google. As I said in this video from my keynote discussion in June 2007, webmasters are welcome to make their sites however they like, but Google in turn reserves the right to protect the quality and relevance of our index. To the best of our knowledge, all the major search engines have adopted similar positions.

Q: Is Google trying to crack down on other forms of advertisements used to drive traffic?
A: No, not at all. Our webmaster guidelines clearly state that you can use links as means to get targeted traffic. In fact, in the presentation I did in August 2007, I specifically called out several examples of non-Google advertising that are completely within our guidelines. We just want disclosure to search engines of paid links so that the paid links won't affect search engines.

Q: I'm aware of a site that appears to be buying/selling links. How can I get that information to Google?
A: Read our official blog post about how to report paid links from earlier in 2007. We've received thousands and thousands of reports in just a few months, but we welcome more reports. We appreciate the feedback, because it helps us take direct action as well as improve our existing algorithmic detection. We also use that data to train new algorithms for paid links that violate our quality guidelines.

Q: Can I get more information?
A: Sure. I wrote more answers about paid links earlier this year if you'd like to read them. And if you still have questions, you can join the discussion in our Webmaster Help Group.