Wednesday, November 24, 2010
Do you know how Google's crawler, Googlebot, handles conflicting directives in your robots.txt file? Do you know how to prevent a PDF file from being indexed? Do you know Googlebot’s favorite song? The answers to these questions (except for the last one :)), along with lots of other information about controlling the crawling and indexing of your site, are now available on code.google.com:
Controlling crawling and indexing
Now site owners have a comprehensive resource where they can learn about robots.txt files, robots meta tags, and X-Robots-Tag HTTP header directives. Please share your comments, and if you have questions you can post them in our Webmaster Help Forum.
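As a taste of what the documentation covers: robots meta tags can only be placed in HTML pages, but the X-Robots-Tag HTTP header works for any file type, including PDFs. A minimal sketch for Apache (assuming mod_headers is enabled; the file pattern and placement in httpd.conf or .htaccess are up to you):

```apache
# Prevent PDF files from being indexed by sending an
# X-Robots-Tag header with every PDF response.
# Requires mod_headers to be enabled.
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
```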
Wednesday, November 17, 2010
Recently we made a change to show more results from a domain for certain types of queries -- this helped searchers get to their desired result even faster. Today we’re expanding the feature so that, when appropriate, more queries show additional results from a domain. As a webmaster, you’ll appreciate the fact that these results may bring targeted visitors directly to the pages they’re interested in.
Here’s an example: in the past, the query [moma] (the Museum of Modern Art) might have triggered two results from the official site:
With this iteration, our search results may show:
- Up to four web results from each domain (i.e., several domains may have multiple results)
- Single-line snippets for the additional results, to keep them compact
Like the hundreds of other changes we make each year, this one is meant to help users quickly reach their desired result. Even though we’re constantly improving our algorithms, our general advice still holds true: create compelling, search-engine friendly sites in order to attract users, buzz, and often targeted traffic!
Written by Harvey Jones, Software Engineer
Thursday, November 11, 2010
We often get questions from webmasters about how we index content designed for Flash Player, so we wanted to take a moment to update you on some of our latest progress.
About two years ago we announced that through a collaboration with Adobe we had significantly improved Google’s capability to index Flash technology based content. Last year we followed up with an announcement that we had added external resource loading to our SWF indexing capabilities. This work has allowed us to index all kinds of textual content in SWF files, from Flash buttons and menus to self-contained Flash technology based websites. Currently almost any text a user can see as they interact with a SWF file on your site can be indexed by Googlebot and used to generate a snippet or match query terms in Google searches. Additionally, Googlebot can also discover URLs in SWF files and follow those links, so if your SWF content contains links to pages inside your website, Google may be able to crawl and index those pages as well.
We’re excited about the progress we’ve made so far and we look forward to keeping you updated about further progress.
Written by Jifeng Situ and Sverre Sundsdal, Software Engineers
Tuesday, November 9, 2010
Today Google introduced Instant Previews, a new search feature that helps people find information faster by showing a visual preview of each result. Traditionally, elements of the search results like the title, URL, and snippet—the text description in each result—help people determine which results are best for them. Instant Previews achieves the same goal with a visual representation of each page and where the relevant content is, instead of a text description. For our webmaster community, this presents an opportunity to reveal the design of your site and why your page is relevant for a particular query. We'd like to offer some thoughts on how to take advantage of the feature.
First of all, it's important to understand what the new feature does. When someone clicks on the magnifying glass on any result, a zoomed-out snapshot of the underlying page appears to the right of the results. Orange highlights indicate where highly relevant content on the page is, and text callouts show search terms in context.
These elements let people know what to expect if they click on that result, and why it's relevant for their query. Our testing shows that the feature really does help with picking the right result—using Instant Previews makes searchers 5% more likely to be satisfied with the results they click.
Many of you have put a lot of thought and effort into the structure of your sites, the layout of your pages, and the information you provide to visitors. Instant Previews gives people a glimpse into that design and indicates why your pages are relevant to their query. Here are some details about how to make good use of the feature.
- Keep your pages clearly laid out and structured, with a minimum of distractions or extraneous content. This is always good advice, since it improves the experience for visitors, and the simplicity and clarity of your site will be apparent via Instant Previews.
- Try to avoid interstitial pages, ad pop-ups, or other elements that interfere with your content. In some cases, these distracting elements may be picked up in the preview of your page, making the screenshots less attractive.
- Many pages have their previews generated as part of our regular crawl process. Occasionally, we will generate a screenshot on the fly when a user needs one, and in these situations we will retrieve information from web pages using a new "Google Web Preview" user-agent.
- Instant Previews does not change our search algorithm or ranking in any way. It's the same results, in the same order. There is also no change to how clicks are tracked. If a user clicks on the title of a result and visits your site, it will count as a normal click, regardless of whether the result was previewed. Previewing a result, however, doesn't count as a click by itself.
- Currently, adding the nosnippet meta tag to your pages will cause them to not show a text snippet in our results. Since Instant Previews serves a similar purpose to snippets, pages with the nosnippet tag will also not show previews. However, we encourage you to think carefully about opting out of Instant Previews. Just like regular snippets, previews tend to be helpful to users—in our studies, results which were previewed were more than four times as likely to be clicked on. URLs that have been disallowed in the robots.txt file will also not show Instant Previews.
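For reference, opting a page out of text snippets (and therefore out of Instant Previews) is done with the standard robots meta tag in the page's head:

```html
<!-- Opt this page out of text snippets; pages with
     nosnippet will not show Instant Previews either. -->
<meta name="robots" content="nosnippet">
```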
- Currently, some videos or Flash content in previews appear as a "puzzle piece" icon or a black square. We're working on rendering these rich content types accurately.
We hope you're as excited about this next step in the search results as we are. We're looking forward to many more improvements to Instant Previews in the future.
Posted by Jeremy Silber, Software Engineer
Monday, November 8, 2010
At Google, we continually strive to improve our algorithms to keep search results relevant and clean. You have been supporting us on this mission by sending spam reports for websites that violate our Webmaster Guidelines, using the spam report form in Google Webmaster Tools. While you might not see changes right away, we take your reports seriously and use them to fine-tune our algorithms -- the feedback is much appreciated and helps us to protect the integrity of our search results. We also take manual action on many of these spam reports. A recent blog post covers more information on how to identify webspam.
For those of you who regularly report spam, or would like to do so, we’ve now published a Chrome extension for reporting spam that makes the process more convenient and simple. The extension adds “Report spam” links to search results and your Web History, taking you directly to the spam report form and autocompleting some form fields for you. With this extension, Google’s spam report form is always just one click away.
The Google Webspam Report Chrome extension provides further tools to help you quickly fill out a spam report:
- a browser button to report the currently viewed page
- an option to retrieve recent Google searches from your Chrome history
- an option to retrieve recently visited URLs from your Chrome history
The extension is available in 16 languages. If your Chrome browser is set to a language supported by the extension, it will automatically use the localized version, otherwise defaulting to English.
Note: We care about your privacy. The Google Webspam Report Chrome extension allows you to access your personal Chrome history for the purpose of reporting spam, but does not send data retrieved from it to our servers. The source code of the extension has been published under an open source license.
Posted by Manuel Holtz, Support Engineer, Search Quality team
Thursday, November 4, 2010
Everyone who uses the web knows how frustrating it is to land on a page that sounds promising in the search results but ends up being useless when you visit it. We work hard to make sure Google’s algorithms catch as much as possible, but sometimes spammy sites still make it into search results. We appreciate the numerous spam reports sent in by users like you who find these issues; the reports help us improve our search results and make sure that great content is treated accordingly. Good spam reports are important to us. Here’s how to maximize the impact of any spam reports you submit:
Why report spam to Google?
Google’s search quality team uses spam reports as a basis for further improving the quality of the results that we show you, to provide a level playing field for webmasters, and to help with our scalable spam fighting efforts. With the release of new tools like our Chrome extension to report spam, we’ve seen people filing more spam reports, and we have to allocate appropriate resources to the spam reports that are most likely to be useful.
Spam reports are prioritized by looking at how much visibility a potentially spammy site has in our search results, in order to help us focus on high-impact sites in a timely manner. For instance, we’re likely to prioritize the investigation of a site that regularly ranks on the first or second page over that of a site that only gets a few search impressions per month. A spam report for a page that is almost never seen by users is less likely to be reviewed compared to higher-impact pages or sites. We generally use spam reports to help improve our algorithms so that we can not only recognize and handle this particular site, but also cover any similar sites. In a few cases, we may additionally choose to immediately remove or otherwise take action on a site.
Which sites should I report?
We love seeing reports about spammy sites that our algorithms have missed. That said, it’s a poor use of your time to report sites that are not spammy. Sites submitted through the spam report form are reviewed for spam content only. Sites that you think should be tackled for other reasons should be submitted to us through the appropriate channels: for example, for those that contain content which you have removed, use our URL removal tools; for sites with malware, use the malware report form; for paid links that you find on sites, use the paid links reporting form. If you want to report spammy links for a page, make sure that you read how to report linkspam. If you have a complaint because someone is copying your content, we have a different copyright process--see our official documentation pages for more info. There’s generally no need to report sites with technical problems or parked domains because these are typically handled automatically.
The same applies to redirecting legitimate sites from one top level domain to another, e.g. example.de redirecting to example.com/de. As long as the content presented is not spammy, the technique of redirecting one domain to another does not automatically violate the Google Webmaster Guidelines.
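As an illustration, such a cross-domain redirect might be implemented with mod_rewrite on the example.de server (a sketch only; the domains are placeholders and the exact rules depend on your setup):

```apache
# Hypothetical .htaccess on example.de: send all visitors to
# the /de section of example.com with a permanent (301) redirect.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.de$ [NC]
RewriteRule ^(.*)$ http://example.com/de/$1 [R=301,L]
```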
The best way to submit a compelling spam report is to take a good look at the website in question and compare it against the Google Webmaster Guidelines. For instance, these would be good reasons to report a site through the spam report form:
- the cached version contains significantly different (often keyword-rich) content from the live version
- you’re redirected to a completely different domain with off-topic, commercial content
- the site is filled with auto-generated or keyword-stuffed content that seems to make no sense
What should I include in a spam report?
Some spam reports are easier to understand than others; having a clear and easy-to-understand report makes it much easier for us to analyze the issue and take appropriate actions. Here are some things to keep in mind when submitting the spam report:
- Submit the URLs of the pages where you see spam (not just the domain name). This makes it easy for us to verify the problem on those specific pages.
- Try to specify the issue as clearly as possible using the checkboxes. Don’t just check every single box--such reports are less likely to be reviewed.
- If only a part of the page uses spammy techniques, for example if it uses cloaking or has hidden text on an otherwise good page, provide a short explanation on how to look for the spam you’re seeing. If you’re reporting a site for spammy backlinks rather than on-page content, mention that.
What happens next?
After reviewing the feedback from these reports (we want to confirm that the reported sites are actually spammy, not just sites that someone didn’t like), it may take a bit of time before we update our algorithms and a change is visible in the search results. Keep in mind that sometimes our algorithms may already be treating those techniques appropriately; for instance, perhaps we’re already ignoring all the hidden text or the exchanged links that you have reported. Submitting the same spam report multiple times is not necessary. Rest assured that we actively review spam reports and take appropriate actions, even if the changes are not immediately visible to you.
With your help, we hope that we can improve the quality and fairness of our search results for everyone! Thank you for continuing to submit spam reports, and feel free to post here or in our Help Forum should you have any questions.
Written by Kaspar Szymanski, Search Quality Strategist & John Mueller, Webmaster Trends Analyst
Wednesday, November 3, 2010
Today we’re introducing a module for the Apache HTTP Server called mod_pagespeed to perform many speed optimizations automatically. We’re starting with more than 15 on-the-fly optimizations that address various aspects of web performance, including optimizing caching, minimizing client-server round trips and minimizing payload size. We’ve seen mod_pagespeed reduce page load times by up to 50% (an average across a rough sample of sites we tried) -- in other words, essentially speeding up websites by about 2x, and sometimes even faster.
Here are a few simple optimizations that are a pain to do manually, but that mod_pagespeed excels at:
- Making changes to the pages built by a Content Management System (CMS) with no need to make changes to the CMS itself,
- Recompressing an image when its HTML context changes to serve only the bytes required (typically tedious to optimize manually), and
- Extending the cache lifetime of the logo and images of your website to a year, while still allowing you to update these at any time.
"Go Daddy is continually looking for ways to provide our customers the best user experience possible. That's the reason we partnered with Google on the 'Make the Web Faster' initiative. Go Daddy engineers are seeing a dramatic decrease in load times of customers' websites using mod_pagespeed and other technologies provided. We hope to provide the technology to our customers soon - not only for their benefit, but for their website visitors as well."
We’re also working with Cotendo to integrate the core engine of mod_pagespeed as part of their Content Delivery Network (CDN) service.
mod_pagespeed integrates as a module for the Apache HTTP Server, and we’ve released it as open-source for Apache for many Linux distributions. Download mod_pagespeed for your platform and let us know what you think on the project’s mailing list. We hope to work with the hosting, developer and webmaster community to improve mod_pagespeed and make the web faster.
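Once the package is installed, enabling the module typically amounts to a couple of Apache directives (a sketch; the module path shown is typical for Debian/Ubuntu packages and may differ on your system):

```apache
# Load mod_pagespeed and turn on its rewriting engine.
LoadModule pagespeed_module /usr/lib/apache2/modules/mod_pagespeed.so
ModPagespeed on
```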
Richard Rabbat, Product Manager, ‘Make the Web Faster’ initiative
Tuesday, November 2, 2010
In time for the holiday season, we now support rich snippets for shopping (e-commerce) sites! As many of you know, rich snippets are search results that have been enhanced using structured data from your web pages. Our new format shows price, availability, and product reviews on pages offering a product for sale. Here’s a result for [office lava lamp]:
As a webmaster, there are two ways that you can provide the information needed for product rich snippets to show up for your site, both described on the Product rich snippets help page:
Option 1: Provide a Merchant Center feed.
Many sites already provide Merchant Center feeds for use in Google Product Search, which means that most of the work needed for rich snippets is already done. For Google to make use of Merchant Center feeds for rich snippets, you should also use the rel=”canonical” link element on your product pages. By adding rel=”canonical” to your pages, Google can match the URLs in your feed to the pages found by our crawler.
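For example, each product page would declare its canonical URL in the document head (the URL here is a placeholder):

```html
<!-- Tells Google which URL is the canonical version of this
     product page, so it can be matched against the feed URLs. -->
<link rel="canonical" href="http://www.example.com/product/lava-lamp">
```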
Update on November 4, 2010: In order to have your product review information in your rich snippets, you can submit your product ratings directly in your feed, or you can work with one of our reviews partners to submit this information. If you work with a partner, your reviews information will appear in rich snippets, and shoppers on Google Product Search will be able to see your full-length reviews on relevant product pages, branded with your logo.
Option 2: Add markup to your site.
If prices for your products tend to change only infrequently, then adding markup is an alternative method to provide product data for rich snippets. We’ve updated our product markup format to allow a variety of different types of shopping sites to participate. In addition to the Google format, we support two other standards: the hProduct microformat and GoodRelations. You can use the rich snippets testing tool to test your markup and make sure it’s being parsed correctly.
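As a rough sketch of what such markup can look like, here is a fragment using hProduct microformat class names (the product values are placeholders; check the testing tool to confirm your own markup parses):

```html
<!-- hProduct microformat: the class names identify the
     brand, product name, price, and description for parsers. -->
<div class="hproduct">
  <span class="brand">ACME</span>
  <span class="fn">Office Lava Lamp</span>:
  <span class="price">$19.99</span>
  <span class="description">A classic lava lamp for your desk.</span>
</div>
```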
This feature is currently available to merchants located in the US, but we will be rolling it out in more markets soon. Additionally, there are a number of rich snippets formats that can be used world-wide in various languages—make your snippets compelling and useful! Should you have any questions about the use of rich snippets, check out our FAQs and feel free to post in our Webmaster Help Forum.
Which should I provide -- a Merchant Center feed or markup?
For most merchants, providing a Merchant Center feed is the best bet. That way your product prices and availability are updated quickly, and the data can be shown in rich snippets as well as in other applications like Google Shopping and Product Ads. If prices and availability change only infrequently, and you don’t want to set up a feed, then adding markup is also a valid option.
If I add markup to my site, will Google show product rich snippets for my pages?
We can’t guarantee that providing a feed or adding markup will result in rich snippets being shown. Note also that it may take a few weeks after providing data for rich snippets to be shown. If you mark up your pages, we encourage you to make sure that the data is parsed correctly by Google by using the rich snippets testing tool. The testing tool updates are rolling out over the next few days, so in this interim period the testing tool may not show previews for some types of markup.
I’ve already done reviews markup for my product offer pages. Should I add product/offer markup as well?
Yes, absolutely. Rich snippets are shown if the information provided accurately represents the main focus of the page. Therefore, for product pages you should add markup using the relevant offer/product fields which can include nested reviews.
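Nesting aggregate review data inside the product markup might look like the following sketch (hReview-aggregate class names with placeholder values; verify with the testing tool):

```html
<div class="hproduct">
  <span class="fn">Office Lava Lamp</span>
  <span class="price">$19.99</span>
  <!-- Nested aggregate review: average rating and review count -->
  <div class="hreview-aggregate">
    <span class="rating">4.5</span> stars from
    <span class="count">89</span> reviews
  </div>
</div>
```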
Written by Nitin Shetti and Mircea Ciurumelea, Search Quality team
Monday, November 1, 2010
If all of your sites offer essentially the same content, additional sites are not contributing much to the Internet.