Monday, October 31, 2011

Raising awareness of cross-domain URL selections

Webmaster level: Advanced

A piece of content can often be reached via several URLs, not all of which may be on the same domain. A common example we’ve talked about over the years is having the same content available on more than one URL, an issue known as duplicate content. When we discover a group of pages with duplicate content, we use algorithms to select one representative URL for that content. A group of pages may contain URLs from the same site or from different sites. When the representative URL is selected from a group that spans different sites, the selection is called a cross-domain URL selection. To take a simple example, if the group of URLs contains one URL from a.com and one URL from b.com and our algorithms select the URL from b.com, the a.com URL may no longer be shown in our search results and may see a drop in search-referred traffic.

Webmasters can greatly influence our algorithms’ selections using one of the currently supported mechanisms to indicate the preferred URL, for example using rel="canonical" elements or 301 redirects. In most cases, the decisions our algorithms make in this regard correctly reflect the webmaster’s intent. However, in the rare cases where a selection is unexpected, we’ve found that many webmasters are confused about why it happened and what they can do if they believe the selection is incorrect.

To be transparent about cross-domain URL selection decisions, we’re launching new Webmaster Tools messages that will attempt to notify webmasters when our algorithms select an external URL instead of one from their website. The details about how these messages work are in our Help Center article about the topic, and in this blog post we’ll discuss the different scenarios in which you may see a cross-domain URL selection and what you can do to fix any selections you believe are incorrect.

Common causes of cross-domain URL selection

There are many scenarios that can lead our algorithms to select URLs across domains.

In most cases, our algorithms select a URL based on signals that the webmaster implemented to influence the decision. For example, a webmaster following our guidelines and best practices for moving websites is effectively signalling that the URLs on their new website are the ones they prefer for Google to select. If you’re moving your website and see these new messages in Webmaster Tools, you can take that as confirmation that our algorithms have noticed.

However, we regularly see webmasters ask questions when our algorithms select a URL they did not want selected. When your website is involved in a cross-domain selection, and you believe the selection is incorrect (i.e. not your intention), there are several strategies to improve the situation. Here are some of the common causes of unexpected cross-domain URL selections that we’ve seen, and how to fix them:

  1. Duplicate content, including multi-regional websites: We regularly see webmasters use substantially the same content in the same language on multiple domains, sometimes inadvertently and sometimes to geotarget the content. For example, it’s common to see a webmaster set up the same English language website on both example.com and example.net, or a German language website hosted on a.de, a.at, and a.ch.

    Depending on your website and your users, you can use one of the currently-supported canonicalization techniques to signal to our algorithms which URLs you wish selected. Please see the following articles about this topic:

  2. Configuration mistakes: Certain types of misconfigurations can lead our algorithms to make an incorrect decision. Examples of misconfiguration scenarios include:
    1. Incorrect canonicalization: Incorrect usage of canonicalization techniques pointing to URLs on an external website can lead our algorithms to select the external URLs to show in our search results. We’ve seen this happen with misconfigured content management systems (CMS) or CMS plugins installed by the webmaster.

      To fix this kind of situation, find how your website is incorrectly indicating the canonical URL preference (e.g. through incorrect usage of a rel="canonical" element or a 301 redirect) and fix that.

    2. Misconfigured servers: Sometimes we see hosting misconfigurations where content from site a.com is returned for URLs on b.com. A similar case occurs when two unrelated web servers return identical soft 404 pages that we may fail to detect as error pages. In both situations we may assume the same content is being returned from two different sites and our algorithms may incorrectly select the a.com URL as the canonical of the b.com URL.

      You will need to investigate which part of your website’s serving infrastructure is misconfigured. For example, your server may be returning HTTP 200 (success) status codes for error pages, or your server might be confusing requests across different domains hosted on it. Once you find the root cause of the issue, work with your server admins to correct the configuration.

  3. Malicious website attacks: Some attacks on websites introduce code that can cause undesired canonicalization. For example, the malicious code might cause the website to return an HTTP 301 redirect or insert a cross-domain rel="canonical" link element into the HTML <head> or HTTP header, usually pointing to an external URL hosting malicious content. In these cases our algorithms may select the malicious or spammy URL instead of the URL on the compromised website.

    In this situation, please follow our guidance on cleaning your site and submit a reconsideration request when done. To identify cloaked attacks, you can use the Fetch as Googlebot function in Webmaster Tools to see your page’s content as Googlebot sees it.
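As a rough illustration of how you might look for the canonicalization problems described above (a misconfigured CMS or injected code adding a cross-domain rel="canonical" element), here is a minimal Python sketch. It is not how our algorithms work; the names `CanonicalFinder` and `cross_domain_canonicals` are hypothetical, and it only inspects the HTML you pass in, not HTTP headers or redirects.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects href values of <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def cross_domain_canonicals(page_url, html):
    """Return canonical URLs in `html` whose host differs from page_url's host."""
    finder = CanonicalFinder()
    finder.feed(html)
    page_host = urlparse(page_url).hostname
    found = []
    for href in finder.canonicals:
        target = urljoin(page_url, href)  # resolve relative hrefs
        if urlparse(target).hostname != page_host:
            found.append(target)
    return found


# Hypothetical page on a.com that declares a canonical URL on b.com
sample = '<html><head><link rel="canonical" href="http://b.com/page"></head></html>'
print(cross_domain_canonicals("http://a.com/page", sample))
```

Note that this simple host comparison treats www.a.com and a.com as different hosts, so you may want to normalize hostnames before relying on the result.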

In rare situations, our algorithms may select a URL from an external site that is hosting your content without your permission. If you believe that another site is duplicating your content in violation of copyright law, you may contact the site’s host to request removal. In addition, you can request that Google remove the infringing page from our search results by filing a request under the Digital Millennium Copyright Act.

And as always, if you need help in identifying the cause of an incorrect decision or how to fix it, you can see our Help Center article about this topic and ask in our Webmaster Help Forum.

Tuesday, October 18, 2011

Accessing search query data for your sites

Webmaster level: All

SSL encryption on the web has been growing by leaps and bounds. As part of our commitment to provide a more secure online experience, today we announced that SSL Search on https://www.google.com will become the default experience for signed in users on google.com. This change will be rolling out over the next few weeks.

What is the impact of this change for webmasters? Today, a web site accessed through organic search results on http://www.google.com (non-SSL) can see both that the user came from google.com and their search query. (Technically speaking, the user’s browser passes this information via the HTTP Referer header field.) However, for organic search results on SSL search, a web site will only know that the user came from google.com.
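To make the difference concrete, here is a minimal Python sketch of how a site's own analytics code might read the referrer value in each case. The function name `search_query_from_referrer`, the example referrer URLs, and the use of a `q` parameter are assumptions for illustration; the exact referrer format is not specified here.

```python
from urllib.parse import parse_qs, urlparse


def search_query_from_referrer(referrer):
    """Return (host, query) for a referrer URL; query is None when absent."""
    parts = urlparse(referrer)
    # Assumption: the search terms, if present, are in a "q" parameter
    terms = parse_qs(parts.query).get("q")
    return parts.hostname, (terms[0] if terms else None)


# Hypothetical non-SSL referrer: both the host and the query are visible
print(search_query_from_referrer("http://www.google.com/search?q=blue+widgets"))

# Hypothetical SSL search referrer: only the host is visible
print(search_query_from_referrer("https://www.google.com/"))
```

In the second case the site can still attribute the visit to google.com, but has no query text to report.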

Webmasters can still access a wealth of search query data for their sites via Webmaster Tools. For sites which have been added and verified in Webmaster Tools, webmasters can do the following:
  • View the top 1000 daily search queries and top 1000 daily landing pages for the past 30 days.
  • View the impressions, clicks, clickthrough rate (CTR), and average position in search results for each query, and compare this to the previous 30 day period.
  • Download this data in CSV format.
In addition, users of Google Analytics’ Search Engine Optimization reports have access to the same search query data available in Webmaster Tools and can take advantage of its rich reporting capabilities.
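If you work with the downloaded data yourself, clickthrough rate is simply clicks divided by impressions. The following Python sketch shows that calculation on a small CSV sample; the column names `query`, `impressions`, and `clicks` and the function name `ctr_by_query` are assumptions for illustration, and the actual layout of the downloaded file may differ.

```python
import csv
import io


def ctr_by_query(csv_text):
    """Map each query to clicks / impressions, skipping rows with no impressions."""
    rates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        impressions = int(row["impressions"])
        if impressions == 0:
            continue  # CTR is undefined without impressions
        rates[row["query"]] = int(row["clicks"]) / impressions
    return rates


# Hypothetical sample data, not real search query figures
sample = "query,impressions,clicks\nblue widgets,200,10\nred widgets,0,0\n"
print(ctr_by_query(sample))
```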

We will continue to look into further improvements to how search query data is surfaced through Webmaster Tools. If you have questions, feedback or suggestions, please let us know through the Webmaster Tools Help Forum.

Wednesday, October 12, 2011

Create and manage Custom Search Engines from within Webmaster Tools

Webmaster level: All

Custom Search Engines (CSEs) enable you to create Google-powered customized search experiences for your sites. You can search over one or more sites, customize the look and feel to match your site, and even make money with AdSense for Search. Now it’s even easier to get started directly from Webmaster Tools.

If you’ve never created a CSE, just click on the “Custom Search” link in the Labs section and we’ll automatically create a default CSE that searches just your site. You can do some basic configuring or immediately get the code snippet to add your new CSE to your site. You can always continue on to the full CSE control panel for more advanced settings.

Once you’ve created your CSE (or if you already had one), clicking the “Custom Search” link in Labs will allow you to manage your CSEs without leaving Webmaster Tools.

We hope these new features make it easier for you to help users search your site. If you have any questions, please post them in our Webmaster Help Forum or the Custom Search Help Forum.

Wednesday, October 5, 2011

Webmaster forums' Top Contributors rock

Webmaster level: All

The TC Summit was a blast! As we wrote in our announcement post, we recently invited more than 250 Top Contributors from all over the world to California to thank them for being so awesome and to give them the opportunity to meet some of our forum guides, engineers and product managers in person.

Our colleagues Adrianne and Brenna already published a recap post on the Official Google Blog. As for us, the search folks at Google, there's not much left to say except that we enjoyed the event and meeting Top Contributors in real life, many of them for the first time. We got the feeling you guys had a great time, too. Let’s quote a few of the folks who make a huge difference on a daily basis:

Sasch Mayer on Google+ (Webmaster TC in English):

"For a number of reasons this event does hold a special place for me, and always will. It's not because I was one of comparatively few people to be invited for a Jolly at the ‘Plex, but because this trip offered the world's TCs a unique opportunity to finally meet each other in person."

Herbert Sulzer, a.k.a. Luzie on Google+ (Webmaster TC in English, German and Spanish):

“Hehehe! Fun, fun fun, this was all fun :D Huhhh”

Aygul Zagidullina on Google+ (Web Search TC in English):

“It was a truly fantastic, amazing, and unforgettable experience meeting so many other TCs across product forums and having the chance to talk to and hear from so many Googlers across so many products!”

Of course we did receive lots of constructive feedback, too. Transparency and communication were at the top of the list, and we're looking into increasing our outreach efforts via Webmaster Tools, so stay tuned! By the way, if you haven’t done so yet, please remember to use the forwarding option in the Webmaster Tools Message Center to get the messages straight to your email inbox. In the meantime please keep an eye on our Webmaster Central Blog, and of course keep on contributing to discussions in the Google Webmaster Forum.

On behalf of all Google guides who participated in the 2011 Summit we want to thank you. You guys rock! :)


That’s right, TCs & Google Guides came from all over the world to convene in California.


TCs & Google Guides from Webmaster Central and Search forums after one of the sessions.


After a day packed with presentations and breakout sessions...


...we did what we actually came for...


...enjoyed a party, celebrated and had a great time together.

Tuesday, October 4, 2011

Webmaster Tools Search Queries data is now available in Google Analytics

Webmaster level: All

Earlier this year we announced a limited pilot for Search Engine Optimization reports in Google Analytics, based on Search queries data from Webmaster Tools. Thanks to valuable feedback from our pilot users, we’ve made several improvements and are pleased to announce that the following reports are now publicly available in the Traffic Sources section of Google Analytics.
  • Queries: impressions, clicks, position, and CTR info for the top 1,000 daily queries
  • Landing Pages: impressions, clicks, position, and CTR info for the top 1,000 daily landing pages
  • Geographical Summary: impressions, clicks, and CTR by country
All of these Search Engine Optimization reports offer Google Analytics’ advanced filtering and visualization capabilities for deeper data analysis. With the secondary dimensions, you can view your site’s data in ways that aren’t available in Webmaster Tools.


To enable these Search Engine Optimization reports for a web property, you must be both a Webmaster Tools verified site owner and a Google Analytics administrator of that property. Once enabled, administrators can choose which profiles can see these reports.

If you have feedback or suggestions, please let us know in the Webmaster Help Forum.