Tuesday, February 27, 2007

Traveling Down Under: GWC at Search Engine Room and Search Summit Australia

G'day Webmasters! Google Webmaster Central is excited to be heading to Sydney for Search Summit and Search Engine Room on March 1-2 and 20-21, respectively.

In addition to our coverage of topics in bot obedience and site architecture, we'll also provide a clinic for building Sitemaps, and chances to "chew the fat" with the Aussies in the "Google Breakfast" and "Google Webmaster Central Q&A." Our Search Evangelist, Adam Lasnik, will lead a fun session in "Living the Non 9-5 Life, Tips for Achieving Balance, Sanity...", where mostly, we hope to learn from you.

Search Summit

Thursday, March 1st
Site Architecture, CSS and Tableless Design 14:45 - 15:30
Peeyush Ranjan, Engineering Manager

Friday, March 2nd
Bot Obedience 09:45 - 10:00
Dan Crow, Product Manager, Crawl Systems

Web 2.0 & Search 11:15 - 12:00
Dan Crow, Product Manager, Crawl Systems

Google Linking Clinic 12:00 - 12:45
Adam Lasnik, Search Evangelist

Lunch with Google Webmaster Central 12:45 - 13:30

Sitemap Clinic 13:30 - 14:15
Maile Ohye, Developer Support Engineer

Google Webmaster Central Q&A 14:15 - 15:00

Living the Non 9-5 Life, Tips for Achieving Balance, Sanity... 15:00 - 15:45
Adam Lasnik, Search Evangelist

Search Engine Room

Tuesday, March 20th
Google Breakfast 07:30 - 09:00
Aaron D'Souza, Software Engineer, Search Quality

Don't Be Evil 09:30 - 10:30
Richard Kimber, Managing Director of Sales and Operations

Monday, February 26, 2007

Better badware notifications for webmasters

In the fight against badware, protecting Google users by showing warnings before they visit dangerous sites is only a small piece of the puzzle. It's even more important to help webmasters protect their own users, and we've been working on this. A few months ago we took the first step and integrated malware notifications into webmaster tools. I'm pleased to announce that we are now including more detailed information in these notifications, and are also sending them to webmasters via email.

Webmaster tools notifications
Now instead of simply informing webmasters that their sites have been flagged and suggesting next steps, we're also showing example URLs that we've determined to be dangerous. This can be helpful when the malicious content is hard to find. For example, a common occurrence with compromised sites is the insertion of a 1-pixel iframe causing the automatic download of badware from another site. By providing example URLs, webmasters are one step closer to diagnosing the problem and ultimately re-securing their sites.

Email notifications
In addition to notifying webmaster tools users, we've also begun sending email notifications to some of the webmasters of sites that we flag for badware. We don't have a perfect process for determining a webmaster's email address, so for now we're sending the notifications to likely webmaster aliases for the domain in question (e.g., webmaster@, admin@, etc). We considered using whois records, but these often contain contact information for the hosting provider or registrar, and you can guess what might happen if a web host learned that one of its client sites was distributing badware. We're planning to allow webmasters to provide a preferred email address for notifications through webmaster tools, so look for this change in the future.

Update: For more information, please see our Help Center article on malware and hacked sites.

Tuesday, February 20, 2007

Tips on using feeds and information on subscriber counts in Reader

Does your site have a feed? A feed can connect you to your readers and keep them returning to your content. Most blogs have feeds, but increasingly, other types of sites with frequently changing content are making feeds available as well.

Find out how many readers are subscribed to your feed
If your site has a feed, you can now get information about the number of Google Reader and Google Personalized Homepage subscribers. If you use Feedburner, you'll start to see numbers from these subscriptions taken into account. You can also find this number in the crawling data in your logs. We crawl feeds with the user-agent Feedfetcher-Google, so simply look for this user-agent in your logs to find the subscriber number. If multiple URLs point to the same feed, we may crawl each separately, so in this case, just count up the subscriber numbers listed for each unique feed-id. An example of what you might see in your logs is below:

User-Agent: Feedfetcher-Google; (+; 4 subscribers; feed-id=1794595805790851116)
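
If you'd rather not count by hand, a short script can do the tallying. The following is a minimal sketch in Python, not an official tool. It assumes your log lines contain the user-agent string in the format shown above, keeps the most recent count seen for each feed-id, and sums those counts. The function names, and the second feed-id in the sample lines, are hypothetical.

```python
import re

# Pulls the subscriber count and feed-id out of a Feedfetcher-Google
# user-agent string like the one shown in the post.
FEEDFETCHER_RE = re.compile(
    r"Feedfetcher-Google;.*?(\d+) subscribers?; feed-id=(\d+)"
)


def subscribers_by_feed(log_lines):
    """Return {feed_id: most recent subscriber count seen in the log}."""
    latest = {}
    for line in log_lines:
        match = FEEDFETCHER_RE.search(line)
        if match:
            count, feed_id = int(match.group(1)), match.group(2)
            latest[feed_id] = count  # later log lines overwrite earlier ones
    return latest


def total_subscribers(log_lines):
    """Add up the latest count for each unique feed-id."""
    return sum(subscribers_by_feed(log_lines).values())


# Sample lines: the first feed-id comes from the post, "42" is made up.
sample_lines = [
    "Feedfetcher-Google; (+; 4 subscribers; feed-id=1794595805790851116)",
    "Feedfetcher-Google; (+; 7 subscribers; feed-id=42)",
    "Feedfetcher-Google; (+; 5 subscribers; feed-id=1794595805790851116)",
    "Mozilla/5.0 (some other visitor)",
]

if __name__ == "__main__":
    print(subscribers_by_feed(sample_lines))
    print(total_subscribers(sample_lines))
```

Keying on feed-id rather than URL matters here: the same feed-id crawled many times should be counted once, while several URLs with different feed-ids each contribute their own count.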

Making your feed available to Google
You can submit your feed as a Sitemap in webmaster tools. This will let us know about the URLs listed in the feed so we can crawl and index them for web search. In addition, if you want to make sure your feed shows up in the list of available feeds for Google products, simply add a <link> tag with the feed URL to the <head> section of your page. For instance:

<link rel="alternate" type="application/atom+xml" title="Your Feed Title" href="" />
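
To double-check that a page actually advertises its feed this way, you could scan its HTML for such a tag. This is a rough sketch using Python's standard html.parser module. The class and function names are hypothetical, the example.com address is a placeholder, and accepting RSS as well as Atom is our assumption rather than something stated above.

```python
from html.parser import HTMLParser

# MIME types we treat as feeds (including RSS is an assumption).
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects <link rel="alternate"> feed tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.feeds = []  # list of (title, href) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            rel = (attributes.get("rel") or "").lower().split()
            link_type = (attributes.get("type") or "").lower()
            if "alternate" in rel and link_type in FEED_TYPES:
                self.feeds.append(
                    (attributes.get("title"), attributes.get("href"))
                )

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_feed_links(html):
    """Return the (title, href) pairs of feed links in the page's <head>."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Placeholder page; the feed URL is hypothetical.
sample_page = (
    "<html><head><title>My site</title>"
    '<link rel="alternate" type="application/atom+xml" '
    'title="Your Feed Title" href="https://example.com/atom.xml" />'
    '<link rel="stylesheet" href="style.css">'
    "</head><body><p>Hello</p></body></html>"
)

if __name__ == "__main__":
    print(find_feed_links(sample_page))
```

An empty result for a page that should have a feed suggests the tag is missing or sits outside the head section.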

Remember that Feedfetcher-Google retrieves feeds only for use in Google Reader and Personalized Homepage. For the content to appear in web search results, Googlebot will have to crawl it as well.

Don't yet have a feed?

If you use a content management system or blogging platform, feed functionality may be built right in. For instance, if you use Blogger, you can go to Settings > Site Feed and make sure that Publish Site Feed is set to Yes. You can also set the feed to either full or short and can add a footer. The URL listed here is what subscribers add to their feed readers. A link to this URL will appear on your blog.

More tips from the Google Reader team
In order to provide the best experience for your users, the Google Reader team has also put together some tips for feed publishers. This document covers feed best practices, common implementation pitfalls, and various ways to promote your feeds. Whether you're creating your feeds from scratch or have been publishing them for a long time, we encourage you to take a look at our tips to make the most of your feeds. If you have any questions, please get in touch.

Wednesday, February 14, 2007

Our Valentine's day gift: out of beta and adding comments

Here at webmaster central, we love the webmaster community -- and today, Valentine's Day, we want to show you that our commitment to you is stronger than ever. We're taking webmaster tools out of beta and enabling comments on this blog.

Bye, bye beta
We've come a long way since our initial launch of the Sitemaps protocol in June 2005. Since then, we've expanded to a full set of webmaster tools, changed our name, listened to your input, and expanded even more. 2006 was a year of great progress, and we're just getting started. Coming out of beta means that we're committed to partnering with webmasters around the world to provide all the tools and information you need about your sites in our index. Together, we can provide the most relevant and useful search results. And more than a million of you, speaking at least 18 different languages, have joined in that partnership.

In addition to the many new features that we've provided, we've been making lots of improvements behind the scenes to ensure that webmaster tools are reliable, scalable, and secure.

The Sitemaps protocol has evolved into version 0.9, and Microsoft and Yahoo have joined us in that support to provide standards that make it easier for you to communicate with search engines. We're excited about how much information we've been able to learn about your sites and we plan to continue to develop the best ways for you to provide us with information about individual pages on your sites.

Hello, comments
Our goal is improved communication with webmasters, and while our blog, discussion forum, and tools help us reach that goal, you can now post comments and feedback directly on this blog as well. This helps you talk to us about topics we're posting. We want to do all we can to encourage an open dialogue between Google and the webmaster community; this is another avenue to do that.

As always, if you have questions or want to talk about things other than a particular blog post, head over to our discussion forum. You'll find our team there often, answering questions and gathering feedback. And if you haven't already, check out the "links to this post" link under every post to see other discussions of this blog across the web.

Thank you, webmasters, for joining us in this great collaboration. Happy Valentine's Day.

Tuesday, February 13, 2007

Update on Public Service Search

Public Service Search is a service that enables non-profit, university, and government web sites to provide search functionality to their visitors without serving ads. While we've stopped accepting new Public Service Search accounts, if you want to add the functionality of this service to your site, we encourage you to check out the Google Custom Search Engine. Note that if you already have a Public Service Search account, you'll be able to continue offering search results on your site.

A Custom Search Engine can provide you with free web search and site search with the option to specify and prioritize the sites that are included in your search results. You can also customize your search engine to match the look and feel of your site, and if your site is a non-profit, university, or government site, you can choose not to display ads on your results pages.

You have two opportunities to disable ads on your Custom Search Engine. You can select the "Do not show ads" option when you first create a Custom Search Engine, or you can follow the steps below to disable advertising on your existing Custom Search Engine:

1. Click the "My search engines" link on the left-hand side of the Overview page.
2. Click the "control panel" link next to the name of your search engine.
3. Under the "Preferences" section of the Control panel page, select the Advertising status option that reads "Do not show ads on results pages (for non-profits, universities, and government agencies only)."
4. Click the "Save Changes" button.

Remember that disabling ads is available only for non-profit, university, and government sites. If you have a site that doesn't fit into one of these categories, you can still provide search to your visitors using the Custom Search Engine capabilities.

For more information or help with Custom Search Engines, check out the FAQ or post a question to the discussion group.

Monday, February 12, 2007

Come see us at SES London and hear tips on successful site architecture

If you're planning to be at Search Engine Strategies London February 13-15, stop by and say hi to one of the many Googlers who will be there. I'll be speaking on Wednesday at the Successful Site Architecture panel and thought I'd offer up some tips for building crawlable sites for those who can't attend.

Make sure visitors and search engines can access the content
  • Check the Crawl errors section of webmaster tools for any pages Googlebot couldn't access due to server or other errors. If Googlebot can't access the pages, they won't be indexed and visitors likely can't access them either.
  • Make sure your robots.txt file doesn't accidentally block search engines from content you want indexed. You can see a list of the files Googlebot was blocked from crawling in webmaster tools. You can also use our robots.txt analysis tool to make sure you're blocking and allowing the files you intend.
  • Check the Googlebot activity reports to see how long it takes to download a page of your site to make sure you don't have any network slowness issues.
  • If pages of your site require a login and you want the content from those pages indexed, ensure you include a substantial amount of indexable content on pages that aren't behind the login. For instance, you can put several content-rich paragraphs of an article outside the login area, with a login link that leads to the rest of the article.
  • How accessible is your site? How does it look in mobile browsers and screen readers? It's well worth testing your site under these conditions and ensuring that visitors can access the content of the site using any of these mechanisms.
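
For the robots.txt tip above, a quick local check can complement the analysis tool in webmaster tools. The sketch below uses Python's standard urllib.robotparser module. Keep in mind that this module follows its own interpretation of robots.txt rules, which may differ from how Googlebot reads the file, so treat the result as a first pass only. The function name, the rules, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and URLs, for illustration only.
sample_robots_txt = """\
User-agent: *
Disallow: /private/
"""

pages_you_want_indexed = [
    "https://example.com/",
    "https://example.com/articles/one.html",
    "https://example.com/private/report.html",
]

if __name__ == "__main__":
    for url in blocked_urls(sample_robots_txt, pages_you_want_indexed):
        print("Blocked:", url)
```

Any URL this reports that you do want indexed is worth a closer look in the robots.txt analysis tool.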

Make sure your content is viewable

  • Check out your site in a text-only browser or view it in a browser with images and Javascript turned off. Can you still see all of the text and navigation?
  • Ensure the important text and navigation in your site is in HTML, not in images, and make sure all images have ALT text that describes them.
  • If you use Flash, use it only when needed. Particularly, don't put all of the text from your site in Flash. An ideal Flash-based site has pages with HTML text and Flash accents. If you use Flash for your home page, make sure that the navigation into the site is in HTML.

Be descriptive

  • Make sure each page has a unique title tag and meta description tag that aptly describe the page.
  • Make sure the important elements of your pages (for instance, your company name and the main topic of the page) are in HTML text.
  • Make sure the words that searchers will use to look for you are on the page.
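
One way to spot pages that share a title or lack a meta description is to scan your HTML with a small script. The following is a sketch under a few assumptions: you already have each page's HTML in memory as a mapping from URL to markup, and the class and function names are hypothetical. It uses only Python's standard html.parser module.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleAndDescription(HTMLParser):
    """Records the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = (attributes.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html):
    """Return (title, description) for one page; description may be None."""
    parser = TitleAndDescription()
    parser.feed(html)
    return parser.title.strip(), parser.description


def duplicate_titles(pages):
    """pages maps URL -> HTML. Return {title: [urls]} for titles used more than once."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[summarize(html)[0]].append(url)
    return {title: sorted(urls) for title, urls in by_title.items() if len(urls) > 1}


def missing_descriptions(pages):
    """Return the URLs whose pages have no (or an empty) meta description."""
    return sorted(url for url, html in pages.items() if not summarize(html)[1])


# Hypothetical pages, for illustration only.
sample_pages = {
    "/a": "<html><head><title>Home</title>"
          "<meta name='description' content='Welcome'></head></html>",
    "/b": "<html><head><title>Home</title></head></html>",
    "/c": "<html><head><title>Contact us</title>"
          "<meta name='description' content='How to reach us'></head></html>",
}

if __name__ == "__main__":
    print(duplicate_titles(sample_pages))
    print(missing_descriptions(sample_pages))
```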

Keep the site crawlable

  • If possible, avoid frames. Frame-based sites don't allow for unique URLs for each page, which makes indexing each page separately problematic.
  • Ensure the server returns a 404 status code for pages that aren't found. Some servers are configured to return a 200 status code, particularly with custom error messages, and this can result in search engines spending time crawling and indexing non-existent pages rather than the valid pages of the site.
  • Avoid infinite crawls. For instance, if your site has an infinite calendar, add a nofollow attribute to links to dynamically-created future calendar pages. Each search engine may interpret the nofollow attribute differently, so check with the help documentation for each. Alternatively, you could use the nofollow meta tag to ensure that search engine spiders don't crawl any outgoing links on a page, or use robots.txt to prevent search engines from crawling URLs that can lead to infinite loops.
  • If your site uses session IDs or cookies, ensure those are not required for crawling.
  • If your site is dynamic, avoid using excessive parameters and use friendly URLs when you can. Some content management systems enable you to rewrite URLs to friendly versions.

See our tips for creating a Google-friendly site and webmaster guidelines for more information on designing your site for maximum crawlability and usability.
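
To test the 404 tip above on your own server, you can request a page that almost certainly doesn't exist and look at the status code that comes back. This is a minimal sketch using Python's standard urllib module. The function names and the "missing-" URL pattern are hypothetical, and the wording of the verdicts is ours. Note that urlopen follows redirects, so a server that redirects missing pages to the home page will also show up here as a 200.

```python
import urllib.error
import urllib.request
import uuid


def probe_url(base_url):
    """Build a URL under base_url that almost certainly does not exist."""
    return base_url.rstrip("/") + "/missing-" + uuid.uuid4().hex + ".html"


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx and 5xx responses; the code is what we want.
        return err.code


def describe_missing_page_status(status):
    """Turn the status returned for a non-existent page into a short verdict."""
    if status == 404:
        return "ok: missing pages return 404"
    if status == 200:
        return "soft 404: missing pages return 200"
    return "unexpected: missing pages return %d" % status


def check_site(base_url):
    """Probe base_url with a made-up path and describe the result."""
    return describe_missing_page_status(fetch_status(probe_url(base_url)))
```

Calling check_site with your site's address makes one real request, so run it against a site you control.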

If you will be at SES London, I'd love for you to come by and hear more. And check out the other Googlers' sessions too:

Tuesday, February 13th

Auditing Paid Listings & Clickfraud Issues 10:45 - 12:00
Shuman Ghosemajumder, Business Product Manager for Trust & Safety

Wednesday, February 14th

A Keynote Conversation 9:00 - 9:45
Matt Cutts, Software Engineer

Successful Site Architecture 10:30 - 11:45
Vanessa Fox, Product Manager, Webmaster Central

Google University 12:45 - 1:45

Converting Visitors into Buyers 2:45 - 4:00
Brian Clifton, Head of Web Analytics, Google Europe

Search Advertising Forum 4:30 - 5:45
David Thacker, Senior Product Manager

Thursday, February 15th

Meet the Crawlers 9:00 - 10:15
Dan Crow, Product Manager

Web Analytics and Measuring Success Overview 1:15 - 2:30
Brian Clifton, Head of Web Analytics, Google Europe

Search Advertising Clinic 1:15 - 2:30
Will Ashton, Retail Account Strategist

Site Clinic 3:00 - 4:15
Sandeepan Banerjee, Sr. Product Manager, Indexing

      Monday, February 5, 2007

      Discover your links

      Update on October 15, 2008: For more recent news on links, visit Links Week on our Webmaster Central Blog. We're discussing internal links, outbound links, and inbound links.

      You asked, and we listened: We've extended our support for querying links to your site well beyond the link: operator you might have used in the past. Now you can use webmaster tools to view a much larger sample of links to pages on your site that we found on the web. Unlike the output of the link: operator, this data is much more comprehensive and can be classified, filtered, and downloaded. All you need to do is verify site ownership to see this information.

      To make this data even more useful, we have divided the world of links into two types: external and internal. Let's look at which kinds of links fall into each bucket.

      What are external links?
      External links to your site are the links that reside on pages that do not belong to your domain. For example, if you are viewing links for a site on your domain, all the links that do not originate from pages on any subdomain of that domain would appear as external links to your site.

      What are internal links?

      Internal links to your site are the links that reside on pages that belong to your domain. For example, if you are viewing links for a site on your domain, all the links that originate from pages on any subdomain of that domain would appear as internal links to your site.
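
The external and internal definitions above can be expressed as a simple hostname test. The sketch below is only an illustration of that rule, not how webmaster tools is implemented. It assumes you pass in your domain yourself (working out the registrable domain automatically is a harder problem), and the function names and example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit


def is_internal(source_url, site_domain):
    """True if the linking page lives on site_domain or one of its subdomains."""
    host = (urlsplit(source_url).hostname or "").lower()
    site_domain = site_domain.lower()
    return host == site_domain or host.endswith("." + site_domain)


def classify_links(source_urls, site_domain):
    """Split the URLs of linking pages into internal and external buckets."""
    buckets = {"internal": [], "external": []}
    for url in source_urls:
        kind = "internal" if is_internal(url, site_domain) else "external"
        buckets[kind].append(url)
    return buckets


# Hypothetical linking pages, for illustration only.
sample_sources = [
    "http://www.example.com/about.html",
    "http://blog.example.com/post-1",
    "http://example.org/links.html",
    "http://notexample.com/",
]

if __name__ == "__main__":
    print(classify_links(sample_sources, "example.com"))
```

Matching on "." plus the domain, rather than on the bare domain string, keeps a lookalike host such as notexample.com in the external bucket.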

      Viewing links to a page on your site

      You can view the links to your site by selecting a verified site in your webmaster tools account and clicking on the new Links tab at the top. Once there, you will see the two options on the left: external links and internal links, with the external links view selected. You will also see a table that lists pages on your site, as shown below. The first column of the table lists pages of your site with links to them, and the second column shows the number of the external links to that page that we have available to show you. (Note that this may not be 100% of the external links to this page.)

      This table also provides the total number of external links to your site that we have available to show you.
      When in this summary view, click the linked number and go to the detailed list of links to that page.
      When in the detailed view, you'll see the list of all the pages that link to a specific page on your site, and the time we last crawled that link. Since you are on the External Links tab on the left, this list shows the external pages that point to the page.

      Finding links to a specific page on your site
      To find links to a specific page on your site, you first need to find that specific page in the summary view. You can do this by navigating through the table, or if you want to find that page quickly, you can use the handy Find a page link at the top of the table. Just fill in the URL and click See details. For example, if the URL of the page you are looking for ends in "?main", you can enter "?main" in the Find a page form. This will take you directly to the detailed view of the links to that page.

      Viewing internal links

      To view internal links to pages on your site, click the Internal Links tab in the left sidebar. This takes you to a summary table that, just like the external links view, displays information about pages on your site with internal links to them.

      However, this view also provides you with a way to filter the data further: to see links from any of the subdomains on the domain, or links from just the specific subdomain you are currently viewing. For example, if you are currently viewing the internal links to one subdomain, you can either see links from all the subdomains of the domain, or you can see links only from other pages on that same subdomain.

      Downloading links data
      There are three different ways to download links data about your site. The first is to download the current view of the table you see, which lets you navigate to any summary or details table and download the data in that view. The second, and probably the most useful, is the list of all external links to your site. This allows you to download a list of all the links that point to your site, along with information about the page they point to and the last time we crawled that link. Third, we provide a similar download for all internal links to your site.

      We do limit the amount of data you can download for each type of link (for instance, you can currently download up to one million external links). Google knows about more links than the total we show, but the overall fraction of links we show is much, much larger than the link: command currently offers. Why not visit us at Webmaster Central and explore the links for your site?