
Wednesday, November 24, 2010

Controlling crawling and indexing now documented on code.google.com

Webmaster Level: All

Do you know how Google's crawler, Googlebot, handles conflicting directives in your robots.txt file? Do you know how to prevent a PDF file from being indexed? Do you know Googlebot’s favorite song? The answers to these questions (except for the last one :)), along with lots of other information about controlling the crawling and indexing of your site, are now available on code.google.com:

Controlling crawling and indexing



Now site owners have a comprehensive resource where they can learn about robots.txt files, robots meta tags, and X-Robots-Tag HTTP header directives. Please share your comments, and if you have questions you can post them in our Webmaster Help Forum.
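As a rough illustration of two of the mechanisms mentioned above, here is a minimal Python sketch. It assumes a hypothetical robots.txt and a hypothetical helper, `indexing_headers`, that adds an `X-Robots-Tag: noindex` header to PDF responses. Note that Python's standard-library parser is not Googlebot and may resolve conflicting directives differently, so treat this as a local sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""


def can_crawl(robots_txt: str, user_agent: str, path: str) -> bool:
    """Check a path against robots.txt rules using Python's stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


def indexing_headers(path: str) -> dict:
    """Return extra response headers; PDF responses are marked noindex."""
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex"}
    return {}
```

In this sketch a crawler that matches a specific `User-agent` group follows that group only, so the `/private/` rule in the `*` group does not apply to it. Also remember that a crawler can only see an `X-Robots-Tag` header on a URL it is allowed to fetch.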

Wednesday, November 17, 2010

Petits fours in your search results

Webmaster Level: All

Recently we made a change to show more results from a domain for certain types of queries -- this helped searchers get to their desired result even faster. Today we’re expanding the feature so that, when appropriate, more queries show additional results from a domain. As a webmaster, you’ll appreciate the fact that these results may bring targeted visitors directly to the pages they’re interested in.

Here’s an example: in the past, the query [moma] (for the Museum of Modern Art) might have triggered two results from the official site:


With this iteration, our search results may show:
  • Up to four web results from each domain (i.e., several domains may have multiple results)
  • Single-line snippets for the additional results, to keep them compact
As before, we still provide links to results from a variety of domains to ensure people find a diverse set of sources relevant to their searches. However, when our algorithms predict pages from a particular site are likely to be most relevant, it makes sense to provide additional direct links in our search results.


As with the hundreds of changes we make each year, our goal is to help users quickly reach their desired result. Even though we’re constantly improving our algorithms, our general advice still holds true: create compelling, search-engine friendly sites in order to attract users, buzz, and often targeted traffic!

Thursday, November 11, 2010

What a feeling! Even better indexing of SWF content

Webmaster Level: All

We often get questions from webmasters about how we index content designed for Flash Player, so we wanted to take a moment to update you on some of our latest progress.

About two years ago we announced that through a collaboration with Adobe we had significantly improved Google’s capability to index Flash technology based content. Last year we followed up with an announcement that we had added external resource loading to our SWF indexing capabilities. This work has allowed us to index all kinds of textual content in SWF files, from Flash buttons and menus to self-contained Flash technology based websites. Currently almost any text a user can see as they interact with a SWF file on your site can be indexed by Googlebot and used to generate a snippet or match query terms in Google searches. Googlebot can also discover URLs in SWF files and follow those links, so if your SWF content contains links to pages inside your website, Google may be able to crawl and index those pages as well.

Last month we expanded our SWF indexing capabilities thanks to our continued collaboration with Adobe and a new library that is more robust and compatible with features supported by Flash Player 10.1. Additionally, thanks to improvements in the way we handle JavaScript, we are also now significantly better at recognizing and indexing sites that use JavaScript to embed SWF content. Finally, we have made improvements in our video indexing technology, resulting in better detection of when a page has a video and better extraction of metadata such as alternate thumbnails from Flash technology based videos. All in all, our SWF indexing technology now allows us to see content from SWF files on hundreds of millions of pages across the web.

While we’ve made great progress indexing SWF content over the past few years, we’re not done yet. We are continuing to work on our ability to index deep linking (content within a Flash technology based application that is linked to from the same application) as well as further improving indexing of SWF files executed through JavaScript. You can help us improve these capabilities by creating unique links for each page that is linked from within a single Flash object and by submitting a Sitemap through Google Webmaster Tools.
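To make the Sitemap suggestion concrete, here is a minimal Python sketch that builds a Sitemap listing one unique URL per page reachable from within a Flash application. The function name `build_sitemap` and the example URLs are hypothetical. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a Sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical deep links, one per view inside a single Flash object.
sitemap_xml = build_sitemap([
    "https://www.example.com/gallery/1",
    "https://www.example.com/gallery/2",
])
```

To write the result to a file with an XML declaration, `ET.ElementTree(element).write(path, encoding="utf-8", xml_declaration=True)` is one option. The file can then be submitted through Webmaster Tools.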

We’re excited about the progress we’ve made so far and we look forward to keeping you updated about further progress.

Tuesday, November 9, 2010

Instant Previews

Webmaster Level: Intermediate to Advanced

Today Google introduced Instant Previews, a new search feature that helps people find information faster by showing a visual preview of each result. Traditionally, elements of the search results like the title, URL, and snippet—the text description in each result—help people determine which results are best for them. Instant Previews achieves the same goal with a visual representation of each page and where the relevant content is, instead of a text description. For our webmaster community, this presents an opportunity to reveal the design of your site and why your page is relevant for a particular query. We'd like to offer some thoughts on how to take advantage of the feature.

First of all, it's important to understand what the new feature does. When someone clicks on the magnifying glass on any result, a zoomed-out snapshot of the underlying page appears to the right of the results. Orange highlights indicate where highly relevant content appears on the page, and text callouts show search terms in context.

Here’s the Instant Preview for the Google Webmaster Forum.

These elements let people know what to expect if they click on that result, and why it's relevant for their query. Our testing shows that the feature really does help with picking the right result—using Instant Previews makes searchers 5% more likely to be satisfied with the results they click.

Many of you have put a lot of thought and effort into the structure of your sites, the layout of your pages, and the information you provide to visitors. Instant Previews gives people a glimpse into that design and indicates why your pages are relevant to their query. Here are some details about how to make good use of the feature.

  • Keep your pages clearly laid out and structured, with a minimum of distractions or extraneous content. This is always good advice, since it improves the experience for visitors, and the simplicity and clarity of your site will be apparent via Instant Previews.
  • Try to avoid interstitial pages, ad pop-ups, or other elements that interfere with your content. In some cases, these distracting elements may be picked up in the preview of your page, making the screenshots less attractive.
  • Many pages have their previews generated as part of our regular crawl process. Occasionally, we will generate screenshots on the fly when a user needs them, and in these situations we will retrieve information from web pages using a new "Google Web Preview" user-agent.
  • Instant Previews does not change our search algorithm or ranking in any way. It's the same results, in the same order. There is also no change to how clicks are tracked. If a user clicks on the title of a result and visits your site, it will count as a normal click, regardless of whether the result was previewed. Previewing a result, however, doesn't count as a click by itself.
  • Currently, adding the nosnippet meta tag to your pages will cause them to not show a text snippet in our results. Since Instant Previews serves a similar purpose to snippets, pages with the nosnippet tag will also not show previews. However, we encourage you to think carefully about opting out of Instant Previews. Just like regular snippets, previews tend to be helpful to users—in our studies, results which were previewed were more than four times as likely to be clicked on. URLs that have been disallowed in the robots.txt file will also not show Instant Previews.
  • Currently, some videos or Flash content in previews appear as a "puzzle piece" icon or a black square. We're working on rendering these rich content types accurately.
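Two of the points above lend themselves to a quick check on your own side: whether a page carries the nosnippet directive, and whether a request in your logs came from the on-the-fly preview fetcher. The following Python sketch assumes only that the user-agent string contains "Google Web Preview", as stated above. The full user-agent string in the usage lines and both helper names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in {"robots", "googlebot"}:
            for token in (attr.get("content") or "").split(","):
                self.directives.add(token.strip().lower())


def has_nosnippet(html: str) -> bool:
    """True if the page opts out of snippets (and therefore previews)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "nosnippet" in parser.directives


def is_web_preview_request(user_agent: str) -> bool:
    """True if the user-agent string mentions Google Web Preview."""
    return "google web preview" in user_agent.lower()


page = '<html><head><meta name="robots" content="noarchive, nosnippet"></head></html>'
opted_out = has_nosnippet(page)
from_preview = is_web_preview_request("Mozilla/5.0 (compatible; Google Web Preview)")
```

A check like `has_nosnippet` can help you audit which pages have opted out of previews before deciding whether that is what you want.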

We hope you're as excited about this next step in the search results as we are. We're looking forward to many more improvements to Instant Previews in the future.

Monday, November 8, 2010

A Chrome extension for reporting webspam

Webmaster Level: All

At Google, we continually strive to improve our algorithms to keep search results relevant and clean. You have been supporting us on this mission by sending spam reports for websites that violate our Webmaster Guidelines, using the spam report form in Google Webmaster Tools. While you might not see changes right away, we take your reports seriously and use them to fine-tune our algorithms -- the feedback is much appreciated and helps us to protect the integrity of our search results. We also take manual action on many of these spam reports. A recent blog post covers more information on how to identify webspam.

For those of you who regularly report spam, or would like to do so, we’ve now published a Chrome extension for reporting spam that makes the process more convenient and simple. The extension adds “Report spam” links to search results and your Web History, taking you directly to the spam report form and autocompleting some form fields for you. With this extension, Google’s spam report form is always just one click away.

The Google Webspam Report Chrome extension provides further tools to help you quickly fill out a spam report:
  • a browser button to report the currently viewed page
  • an option to retrieve recent Google searches from your Chrome history
  • an option to retrieve recently visited URLs from your Chrome history
As before, you need to be logged into your Google Account to report spam. You can find a more detailed walkthrough of the use cases and features in this presentation and on the Chrome Extensions Gallery page, where you can also provide feedback and suggestions. We hope that you find this extension useful and that you continue to help us fight spam.

The extension is available in 16 languages. If your Chrome browser is set to a language supported by the extension, it will automatically use the localized version, otherwise defaulting to English.

Note: We care about your privacy. The Google Webspam Report Chrome extension allows you to access your personal Chrome history for the purpose of reporting spam, but does not send data retrieved from it to our servers. The source code of the extension has been published under an open source license.

Thursday, November 4, 2010

How to help Google identify web spam

Webmaster Level: All

Everyone who uses the web knows how frustrating it is to land on a page that sounds promising in the search results but ends up being useless when you visit it. We work hard to make sure Google’s algorithms catch as much as possible, but sometimes spammy sites still make it into search results. We appreciate the numerous spam reports sent in by users like you who find these issues; the reports help us improve our search results and make sure that great content is treated accordingly. Good spam reports are important to us. Here’s how to maximize the impact of any spam reports you submit:

Why report spam to Google?

Google’s search quality team uses spam reports as a basis for further improving the quality of the results that we show you, to provide a level playing field for webmasters, and to help with our scalable spam fighting efforts. With the release of new tools like our Chrome extension to report spam, we’ve seen people filing more spam reports and we have to allocate appropriate resources to the spam reports that are most likely to be useful.

Spam reports are prioritized by looking at how much visibility a potentially spammy site has in our search results, in order to help us focus on high-impact sites in a timely manner. For instance, we’re likely to prioritize the investigation of a site that regularly ranks on the first or second page over that of a site that only gets a few search impressions per month. A spam report for a page that is almost never seen by users is less likely to be reviewed than one for a higher-impact page or site. We generally use spam reports to help improve our algorithms so that we can not only recognize and handle this particular site, but also cover any similar sites. In a few cases, we may additionally choose to immediately remove or otherwise take action on a site.

Which sites should I report?

We love seeing reports about spammy sites that our algorithms have missed. That said, it’s a poor use of your time to report sites that are not spammy. Sites submitted through the spam report form are reviewed for spam content only. Sites that you think should be tackled for other reasons should be submitted to us through the appropriate channels: for example, for those that contain content which you have removed, use our URL removal tools; for sites with malware, use the malware report form; for paid links that you find on sites, use the paid links reporting form. If you want to report spammy links for a page, make sure that you read how to report linkspam. If you have a complaint because someone is copying your content, we have a different copyright process--see our official documentation pages for more info. There’s generally no need to report sites with technical problems or parked domains because these are typically handled automatically.

The same applies to redirecting legitimate sites from one top level domain to another, e.g. example.de redirecting to example.com/de. As long as the content presented is not spammy, the technique of redirecting one domain to another does not automatically violate the Google Webmaster Guidelines.


If you happen to come across a gibberish site similar to this one, it’s most likely spam.

The best way to submit a compelling spam report is to take a good look at the website in question and compare it against the Google Webmaster Guidelines. For instance, these would be good reasons to report a site through the spam report form:
  • the cached version contains significantly different (often keyword-rich) content from the live version
  • you’re redirected to a completely different domain with off-topic, commercial content
  • the site is filled with auto-generated or keyword-stuffed content that seems to make no sense
These are just a few examples of techniques that might be spammy, and which we would appreciate seeing in the form of a spam report. When in doubt, please feel free to discuss your concerns on the Help Forum with other users and Google guides.

What should I include in a spam report?

Some spam reports are easier to understand than others; having a clear and easy-to-understand report makes it much easier for us to analyze the issue and take appropriate actions. Here are some things to keep in mind when submitting the spam report:
  • Submit the URLs of the pages where you see spam (not just the domain name). This makes it easy for us to verify the problem on those specific pages.
  • Try to specify the issue as clearly as possible using the checkboxes. Don’t just check every single box--such reports are less likely to be reviewed.
  • If only a part of the page uses spammy techniques, for example if it uses cloaking or has hidden text on an otherwise good page, provide a short explanation on how to look for the spam you’re seeing. If you’re reporting a site for spammy backlinks rather than on-page content, mention that.
By following these guidelines, your spam reports will be reproducible and clear, making them easier to analyze on our side.

What happens next?

After we review the feedback from these reports (we want to confirm that the reported sites are actually spammy, not just sites that someone didn’t like), it may take a bit of time before we update our algorithms and a change is visible in the search results. Keep in mind that sometimes our algorithms may already be treating those techniques appropriately; for instance, perhaps we’re already ignoring all the hidden text or the exchanged links that you have reported. Submitting the same spam report multiple times is not necessary. Rest assured that we actively review spam reports and take appropriate actions, even if the changes are not immediately visible to you.

With your help, we hope that we can improve the quality of and fairness in our search results for everyone! Thank you for continuing to submit spam reports and feel free to post here or in our Help Forum should you have any questions.

Wednesday, November 3, 2010

Make your websites run faster, automatically -- try mod_pagespeed for Apache

Webmaster Level: All

Last year, as part of Google’s initiative to make the web faster, we introduced Page Speed, a tool that gives developers suggestions to speed up web pages. It’s usually pretty straightforward for developers and webmasters to implement these suggestions by updating their web server configuration, HTML, JavaScript, CSS and images. But we thought we could make it even easier -- ideally these optimizations should happen with minimal developer and webmaster effort.

So today, we’re introducing a module for the Apache HTTP Server called mod_pagespeed to perform many speed optimizations automatically. We’re starting with more than 15 on-the-fly optimizations that address various aspects of web performance, including optimizing caching, minimizing client-server round trips and minimizing payload size. We’ve seen mod_pagespeed reduce page load times by up to 50% (an average across a rough sample of sites we tried) -- in other words, essentially speeding up websites by about 2x, and sometimes even faster.

Comparison of the AdSense blog site with and without mod_pagespeed


Here are a few simple optimizations that are a pain to do manually, but that mod_pagespeed excels at:
  • Making changes to the pages built by a Content Management System (CMS) with no need to make changes to the CMS itself,
  • Recompressing an image when its HTML context changes to serve only the bytes required (typically tedious to optimize manually), and
  • Extending the cache lifetime of the logo and images of your website to a year, while still allowing you to update these at any time.
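The last item in the list above, a year-long cache lifetime that still allows updates at any time, is commonly achieved by putting a content fingerprint in the resource URL. The Python sketch below illustrates that general technique. It is not mod_pagespeed's implementation, and the function names and URL layout are hypothetical.

```python
import hashlib
import posixpath

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def fingerprinted_url(path: str, content: bytes) -> str:
    """Embed a short content hash in the URL so changed files get new URLs."""
    digest = hashlib.sha256(content).hexdigest()[:10]
    stem, ext = posixpath.splitext(path)
    return f"{stem}.{digest}{ext}"


def cache_headers() -> dict:
    """Headers that let browsers cache a fingerprinted resource for a year."""
    return {"Cache-Control": f"public, max-age={ONE_YEAR_SECONDS}"}


old_url = fingerprinted_url("/img/logo.png", b"logo bytes, version 1")
new_url = fingerprinted_url("/img/logo.png", b"logo bytes, version 2")
```

Because the URL changes whenever the bytes change, the HTML can point at the new URL immediately while old copies simply age out of caches.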
We’re working with Go Daddy to get mod_pagespeed running for many of its 8.5 million customers. Warren Adelman, President and COO of Go Daddy, says:
"Go Daddy is continually looking for ways to provide our customers the best user experience possible. That's the reason we partnered with Google on the 'Make the Web Faster' initiative. Go Daddy engineers are seeing a dramatic decrease in load times of customers' websites using mod_pagespeed and other technologies provided. We hope to provide the technology to our customers soon - not only for their benefit, but for their website visitors as well."
We’re also working with Cotendo to integrate the core engine of mod_pagespeed as part of their Content Delivery Network (CDN) service.

mod_pagespeed integrates as a module for the Apache HTTP Server, and we’ve released it as open source for Apache on many Linux distributions. Download mod_pagespeed for your platform and let us know what you think on the project’s mailing list. We hope to work with the hosting, developer and webmaster community to improve mod_pagespeed and make the web faster.

Tuesday, November 2, 2010

Rich snippets for shopping sites

Webmaster Level: All

In time for the holiday season, we now support rich snippets for shopping (e-commerce) sites! As many of you know, rich snippets are search results that have been enhanced using structured data from your web pages. Our new format shows price, availability, and product reviews on pages offering a product for sale. Here’s a result for [office lava lamp]:


As a webmaster, there are two ways that you can provide the information needed for product rich snippets to show up for your site, both described on the Product rich snippets help page:

Option 1: Provide a Merchant Center feed.

Many sites already provide Merchant Center feeds for use in Google Product Search, which means that most of the work needed for rich snippets is already done. For Google to make use of Merchant Center feeds for rich snippets, you should also use the rel="canonical" link element on your product pages. By adding rel="canonical" to your pages, Google can match the URLs in your feed to the pages found by our crawler.
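If you want to check this on your own pages, the following Python sketch reads the canonical link element from each page and reports pages whose canonical URL does not appear in your feed. It is only a local consistency check under assumed inputs: the page URLs, the feed URLs and the helper names are hypothetical, and it does not reflect how Google performs the matching.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_url(html: str):
    """Return the page's canonical URL, or None if it declares none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def pages_not_matching_feed(pages: dict, feed_urls: set) -> list:
    """List page URLs whose canonical URL is missing or absent from the feed."""
    return [url for url, html in pages.items()
            if canonical_url(html) not in feed_urls]


# Hypothetical crawled pages (URL -> HTML) and feed URLs.
pages = {
    "https://shop.example.com/lamp?color=red":
        '<link rel="canonical" href="https://shop.example.com/lamp">',
    "https://shop.example.com/desk": "<html><head></head></html>",
}
unmatched = pages_not_matching_feed(pages, {"https://shop.example.com/lamp"})
```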

Update on November 4, 2010: In order to have your product review information in your rich snippets, you can submit your product ratings directly in your feed, or you can work with one of our reviews partners to submit this information. If you work with a partner, your reviews information will appear in rich snippets, and shoppers on Google Product Search will be able to see your full-length reviews on relevant product pages, branded with your logo.

Option 2: Add markup to your site.

If prices for your products tend to change only infrequently, then adding markup is an alternative method to provide product data for rich snippets. We’ve updated our product markup format to allow a variety of different types of shopping sites to participate. In addition to the Google format, we support two other standards: the hProduct microformat and GoodRelations. You can use the rich snippets testing tool to test your markup and make sure it’s being parsed correctly.

This feature is currently available to merchants located in the US, but we will be rolling it out in more markets soon. Additionally, there are a number of rich snippets formats that can be used world-wide in various languages—make your snippets compelling and useful! Should you have any questions about the use of rich snippets, check out our FAQs and feel free to post in our Webmaster Help Forum.

Q&A

Which should I provide -- a Merchant Center feed or markup?

For most merchants, providing a Merchant Center feed is the best bet. That way your product prices and availability are updated quickly, and the data can be shown in rich snippets as well as in other applications like Google Shopping and Product Ads. If prices and availability change only infrequently, and you don’t want to set up a feed, then adding markup is also a valid option.

If I add markup to my site, will Google show product rich snippets for my pages?

We can’t guarantee that providing a feed or adding markup will result in rich snippets being shown. Note also that it may take a few weeks after providing data for rich snippets to be shown. If you mark up your pages, we encourage you to make sure that the data is parsed correctly by Google by using the rich snippets testing tool. The testing tool updates are rolling out over the next few days, so in this interim period the testing tool may not show previews for some types of markup.

I’ve already done reviews markup for my product offer pages. Should I add product/offer markup as well?

Yes, absolutely. Rich snippets are shown if the information provided accurately represents the main focus of the page. Therefore, for product pages you should add markup using the relevant offer/product fields which can include nested reviews.

Monday, November 1, 2010

Best practices for running multiple sites

Webmaster Level: All

Running a single compelling, high quality site can be time- and resource-consuming, not to mention the creativity it requires to make the site a great one. At times, particularly when it comes to rather commercial topics like foreign currency exchange or online gambling, we see that some webmasters try to compete for visibility in Google search results with a large number of sites on the same topic. There are a few things to keep in mind when considering a strategy like this for sites that you want to have listed in our search results.

Some less creative webmasters, or those short on time but with substantial resources on their hands, might be tempted to create a multitude of similar sites without necessarily adding unique information to any of these. From a user’s perspective, these sorts of repetitive sites can constitute a poor user experience when visible in search results. Luckily, over time our algorithms have gotten pretty good at recognizing similar content so as to serve users with a diverse range of information. We don’t recommend creating similar sites like that; it’s not a good use of your time and resources.


If all of your sites offer essentially the same content, additional sites are not contributing much to the Internet.

While you’re free to run as many sites as you want, keep in mind that users prefer to see unique and compelling content. It is a good idea to give each site its own content, personality and function. This is true of any website, regardless of whether it’s a single-page hobby-site or part of a large portfolio. When you create a website, try to add something new or some value to the Internet; make something your users have never seen before, something that inspires and fascinates them, something they can’t wait to recommend to their friends.

When coming up with an idea for a website, scan the web first. There are many websites dealing with common and popular services like holiday planning, price comparisons or foreign exchange currency trading. It frequently doesn’t make sense to reinvent the wheel and compete with existing broad topic sites. It’s often more practical and rewarding to focus on smaller or niche topics where your expertise is best and where competition for user attention might be less fierce.

A few webmasters choose to focus their resources on one domain but make use of their domain portfolio by creating a multitude of smaller sites linking to it. In some situations these sites may be perceived as doorways. Without value of their own, these doorway sites are unlikely to stand the test of time in our search results. If you registered several domains but only want to focus on one topic, we recommend you create unique and compelling content on each domain or simply 301 redirect all users to your preferred domain. Think of your web endeavour as if it were a restaurant: You want each dish to reflect the high quality of the service you provide; repeat the same item over and over on your menu and your restaurant might not do so well. Identify and promote your strength or uniqueness. Ask yourself the following questions: What makes you better than the competition? What new service do you provide that others don’t? What makes your sites unique and compelling enough to make users want to revisit them, link to them or even recommend them to their friends?
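For the 301 redirect option mentioned above, here is a minimal Python sketch of a handler that permanently redirects every request on a secondary domain to the same path on the preferred domain. `PREFERRED_ORIGIN` and the handler name are hypothetical. In practice most sites would configure this in their web server rather than run a separate process.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical preferred domain that all secondary domains should point to.
PREFERRED_ORIGIN = "https://www.example.com"


def redirect_target(path: str) -> str:
    """Map a requested path (with any query string) onto the preferred domain."""
    if not path.startswith("/"):
        path = "/" + path
    return PREFERRED_ORIGIN + path


class PermanentRedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request with a 301 to the preferred domain."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_HEAD = do_GET
```

To try it locally, `HTTPServer(("", 8080), PermanentRedirectHandler).serve_forever()` from `http.server` starts the handler on port 8080. Keeping the path in the redirect means visitors land on the equivalent page rather than the home page.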

We suggest not spreading out your efforts too broadly, though. It can be difficult to maintain multiple sites while keeping the content fresh and engaging. It’s better to have one or a few good sites than a multitude of shallow, low value-add sites. As always, we encourage you to share your thoughts via comments as well as by contributing to the Google Webmaster community.