Tuesday, July 31, 2007

Supplemental goes mainstream

When Google originally introduced Supplemental Results in 2003, our main web index had billions of web pages. The supplemental index made it possible to index even more web pages and, just like our main web index, make this content available when generating relevant search results for user queries. This was especially useful for queries that did not return many results from the main web index; for these, the supplemental index allowed us to search even more web pages. Because we place fewer constraints on the sites we crawl for the supplemental index, web pages that are not in the main web index can still be included in the supplemental index. These are often pages with lower PageRank or more complex URLs. Thus the supplemental index (read more, and here's Matt's talk about it on video) serves a very important purpose: to index as much of the relevant content that we crawl as possible.

The changes we make must focus on improving the search experience for our users. Since 2006, we've completely overhauled the system that crawls and indexes supplemental results. The current system provides deeper and more continuous indexing. Additionally, we are indexing URLs with more parameters and are continuing to place fewer restrictions on the sites we crawl. As a result, Supplemental Results are fresher and more comprehensive than ever. We're also working towards showing more Supplemental Results by ensuring that every query is able to search the supplemental index, and expect to roll this out over the course of the summer.

The distinction between the main and the supplemental index is therefore continuing to narrow. Given all the progress that we've been able to make so far, and thinking ahead to future improvements, we've decided to stop labeling these URLs as "Supplemental Results." Of course, you will continue to benefit from Google's supplemental index being deeper and fresher.

Wednesday, July 18, 2007

Message Center: Let us communicate with you about your site

Today we're launching our Message Center, a new way for webmasters to receive personalized information from Google in our webmaster console. Should we need to contact you, you'll see a notification in your Webmaster Tools dashboard.

Initially the messages will refer to search quality issues, but over time we'll use the Message Center as a communication channel for more types of information. Here's an example: informing the site owner about hidden text, a violation of our webmaster guidelines.

For our webmasters outside the U.S., we’re also pleased to tell you that Message Center is capable of providing information in all supported Webmaster Tools languages (French, Italian, German, Spanish, Danish, Dutch, Swedish, Russian, Chinese-Simplified, Chinese-Traditional, Korean, Japanese, etc.), across all countries.

Right now the number of sites we’re contacting is small, but we hope to expand this program over time. We’re also really happy that the Message Center lets us communicate with webmasters in an authenticated way. As time goes on, we’ll keep looking for even more ways to improve communication with site owners, but right now, why not claim your site in our webmaster tools so that we can give you a heads-up of any issues that we see?

Friday, July 13, 2007

New warnings feedback

Given helpful suggestions from our discussion group, we've improved feedback for sitemaps in Webmaster Tools. Now, minor problems in a sitemap will be reported as "warnings," and will appear instead of, or in addition to, more serious "errors." (Previously all problems were listed as errors.) Warnings allow us to provide feedback on portions of your sitemap that may be confusing or inaccurate, while saving the real "error" alarm for problems that make your sitemap completely unreadable. We hope the additional information makes it even easier to share your sitemaps with Google.

The new set of warnings includes many problems that we had previously classified as errors, including the "incorrect namespace" and "invalid date" examples shown in the screenshot above. We also crawl a sample of the URLs listed in your sitemap and report warnings if the Googlebot runs into any trouble with them. These warnings might suggest a widespread problem with your site that warrants further investigation, such as a stale sitemap or a misconfigured robots.txt file.
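
To make the warning/error split concrete, here is a minimal sketch of a local sitemap check. It is not how Webmaster Tools classifies problems; the function name `check_sitemap`, the choice to validate only the date portion of `lastmod`, and the exact rules are assumptions for illustration. The namespace value is the one published in the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace from the sitemaps.org protocol (assumed to be the expected one here).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def _local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def check_sitemap(xml_text):
    """Return (errors, warnings) for a sitemap passed as a string.

    Hypothetical rules: unparseable XML is an error because nothing can
    be read; a wrong namespace or a malformed date is only a warning.
    """
    errors, warnings = [], []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"unreadable sitemap: {exc}"], warnings

    # A namespaced root tag looks like '{namespace}urlset'.
    namespace = root.tag[1:].partition("}")[0] if root.tag.startswith("{") else ""
    if namespace != SITEMAP_NS:
        warnings.append(f"incorrect namespace: {namespace!r}")

    for elem in root.iter():
        if _local_name(elem.tag) != "lastmod":
            continue
        value = (elem.text or "").strip()
        try:
            # Simplification: only the leading YYYY-MM-DD part is checked.
            datetime.strptime(value[:10], "%Y-%m-%d")
        except ValueError:
            warnings.append(f"invalid date: {value!r}")
    return errors, warnings
```

Running a check like this before submitting a sitemap may catch the minor problems early, so that only issues found during crawling remain to be investigated.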

Please let us know how you like this new feedback. Tell us what you think via the comments below, or in the discussion group. We also appreciate suggestions for additional warnings that you would find useful.

Thursday, July 5, 2007

Best uses of Flash

We occasionally get questions on the Webmaster Help Group about how webmasters should work with Adobe Flash. I thought it would be worthwhile to write a few words about the search considerations designers should think about when building a Flash-heavy site.

As many of you already know, Flash is inherently a visual medium, and Googlebot doesn't have eyes. Googlebot can typically read Flash files and extract the text and links in them, but the structure and context are missing. Moreover, textual contents are sometimes stored in Flash as graphics, and since Googlebot doesn't currently have the algorithmic eyes needed to read these graphics, these important keywords can be missed entirely. All of this means that even if your Flash content is in our index, it might be missing some text, content, or links. Worse, while Googlebot can understand some Flash files, not all Internet spiders can.

So what's an honest web designer to do? The only hard and fast rule is to show Googlebot the exact same thing as your users. If you don't, your site risks appearing suspicious to our search algorithms. This simple rule covers a lot of cases including cloaking, JavaScript redirects, hidden text, and doorway pages. And our engineers have gathered a few more practical suggestions:

  1. Try to use Flash only where it is needed. Many rich media sites such as Google's YouTube use Flash for rich media but rely on HTML for content and navigation. You can too, by limiting Flash to on-page accents and rich media, not content and navigation. In addition to making your site Googlebot-friendly, this makes your site accessible to a larger audience, including, for example, blind people using screen readers, users of old or non-standard browsers, and those on low-bandwidth connections such as a cell phone or PDA. As a bonus, your visitors can use bookmarks effectively and can email links to your pages to their friends.
  2. sIFR: Some websites use Flash to force the browser to display headers, pull quotes, or other textual elements in a font that the user may not have installed on their computer. A technique like sIFR still lets non-Flash readers read a page, since the content/navigation is actually in the HTML -- it's just displayed by an embedded Flash object.
  3. Non-Flash Versions: A common way that we see Flash used is as a front page "splash screen" where the root URL of a website has a Flash intro that links to HTML content deeper into the site. In this case, make sure there is a regular HTML link on that front page to a non-Flash page where a user can navigate throughout your site without the need for Flash.
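
As a rough illustration of the third suggestion, the sketch below flags a page that embeds Flash but offers no regular HTML link at all. It is only a local heuristic under stated assumptions: the names `FlashLinkAudit` and `needs_html_fallback` are made up for this example, Flash is detected only by MIME type or a `.swf` file extension, and nothing here reflects how Googlebot actually processes pages.

```python
from html.parser import HTMLParser

FLASH_MIME = "application/x-shockwave-flash"


class FlashLinkAudit(HTMLParser):
    """Collect plain HTML links and note whether the page embeds Flash."""

    def __init__(self):
        super().__init__()
        self.has_flash = False
        self.html_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("object", "embed"):
            # Assumed heuristic: Flash is identified by type or .swf source.
            source = (attrs.get("data") or attrs.get("src") or "").lower()
            if attrs.get("type") == FLASH_MIME or source.endswith(".swf"):
                self.has_flash = True
        elif tag == "a" and attrs.get("href"):
            self.html_links.append(attrs["href"])


def needs_html_fallback(html):
    """True when the page uses Flash but has no regular HTML link."""
    audit = FlashLinkAudit()
    audit.feed(html)
    audit.close()
    return audit.has_flash and not audit.html_links
```

For a splash page that consists only of an embedded intro movie, `needs_html_fallback` returns `True`; adding a single `<a href="...">` to a non-Flash page makes it return `False`.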

If you have other ideas that don't violate these guidelines and you'd like to ask about them, feel free to do so in the Webmaster Help Group under Crawling, Indexing, and Ranking. The many knowledgeable webmasters there, along with me and a cadre of other Googlers, will do our best to clear up any confusion.

Update: See our additional blog posts about Flash Indexing at Google.

Monday, July 2, 2007

How to create valuable startpages

In the Dutch market, the concept of so-called 'startpages' is hugely popular. In this article we will give some background information on them, and give those of you who may be startpage webmasters a few tips on how to create unique and informative startpages.

What's a startpage?

Basically, it's a webpage with a lot of links about a specific topic. The startpages are hosted on a startpage domain and each separate startpage is maintained by an individual webmaster. The links on startpages are usually ordered by categories related to the topic of the page. Besides hyperlinks, startpages often contain text, animations and pictures. Startpages are quite unique to the Dutch market, and offer a simple interface for novice users to create their own web portals, with a unique approach to user-generated content.

The whole startpage concept began in September 1998 with the launch of, which was set up to be an online linkbook for the inexperienced Internet user. Since then, has become a huge success, mainly because an enormous number of volunteers created and maintained the different startpages covering lots of interesting and diverse topics. Since emerged, lots of other startpage domains have been created, and are still being created today. The fact that there are still new startpage domains appearing and that the number of individual startpages on these domains is still increasing shows the continued popularity of startpages in the Dutch market.

Creating useful startpages

As a search engine, we love to have useful and diverse pages showing up in the search results we present to our users. We thought it would be a good idea to highlight some of the best practices we've seen in creating value-added startpages.

  1. Create your startpage for users, and not for search engines. This involves making sure that all your text on the page is visible to users, and writing full sentences as descriptions instead of just keywords.
  2. Try to deliver unique, informative and on-topic content. The structure of startpages is pretty straightforward and does not leave much room for variation. However, you can make a difference. Try to find a topic you know a lot about that has not been fully covered yet. Create good categories that are related to your topic and give a relevant title to every category. Then, find links that are related to the categories on your page and label every link with an anchor text that is relevant. For example, instead of naming your links 'link1', 'link2' et cetera, you can choose names that make clear where the link is pointing to. And you can write a short description for every category.
  3. Don't create startpages out of commercial intent or for the sole purpose of exchanging links. Of course there is nothing wrong with trying to monetize your startpage, but a page with only banners and affiliate links is not the best user experience and therefore not recommended. The same goes for startpages that are created as part of a link network, for example pages whose links all point to a particular website and to other startpages that also point to that same website. These kinds of link schemes have no added value for the user and go against the Google webmaster guidelines.
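
The advice on anchor text in the second tip can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper `vague_links` and a hand-picked pattern for generic labels ('link1', 'link 2', 'click here', 'here'); only the 'link1'/'link2' style comes from the tip itself, and the other patterns are illustrative assumptions.

```python
import re
from html.parser import HTMLParser

# Assumed list of uninformative labels; extend it to suit your own pages.
GENERIC = re.compile(r"^(link\s*\d*|click here|here)$", re.IGNORECASE)


class AnchorTextCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def vague_links(html):
    """Return the hrefs whose anchor text is empty or generic."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    return [href for href, text in collector.anchors
            if not text or GENERIC.match(text)]
```

Any link the helper returns is a candidate for a label that says where the link actually leads.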

With this post, we hope to have provided potential startpage webmasters with some helpful guidelines for creating the type of startpages that Dutch-speaking people love!

On a final note, we would like to encourage you to fill in a paid links form if you come across a startpage that is involved in buying and selling links for the purpose of search engine manipulation. To report other forms of bad behavior, you can send a spam report. We'll review each report we get and use this feedback to enhance our algorithms and improve our search results. As always, we really appreciate your feedback and your help to provide the best search experience.


In the Dutch-language market, so-called startpages are especially popular. In this article we want to give some background information on startpages and offer future startpage maintainers a number of tips for creating unique and informative startpages.

What is a startpage?

A startpage is a web page with a collection of links related to a specific topic. Startpages are hosted on a startpage domain, and each individual startpage is maintained by a webmaster. The links on a startpage are usually divided into categories that are relevant to the page's specific topic. Besides hyperlinks, a startpage often contains text, animations, and pictures. The startpage concept is fairly specific to the Dutch-language market and hardly occurs in other markets. Startpages have a simple interface that makes it easy, even for inexperienced Internet users, to create their own web page.

The startpage concept came about in September 1998 with the launch of, which was set up as a kind of linkbook for the inexperienced Internet user. It soon proved to be a huge success, thanks mainly to the enormous number of volunteers who helped create and maintain startpages. The fact that now, almost nine years later, new startpage domains are still appearing and the number of individual startpages on these domains is still growing shows that startpages remain as popular as ever.

Creating a valuable startpage

As a search engine, we love having valuable pages with unique content and diversity in our search results. We therefore thought it would be a good idea to give a few tips that can help in creating startpages with added value.

  1. Create a startpage for Internet users and not for search engines. Make sure that all text is visible, and use full sentences instead of just a handful of keywords.
  2. Try to present your visitors with unique, informative content that is related to your topic. Although the layout of a standard startpage does not leave much room for variation, you as the maintainer can make the difference! Start by looking for a topic you know a lot about and on which, in your view, not enough information is available yet. Then create relevant categories related to the topic and give each category a relevant name. Next, find the links you want to place on your startpage and give each link an anchor text that describes where the link sends your visitor. Don't name your links link1, link2, and link3, but give them a name that is relevant to the content of the page the link points to. As an extra, you can add a short description for every category.
  3. Don't create startpages from a purely commercial point of view. There is nothing wrong with trying to earn something from your startpage, but remember that your visitors are not looking for a page with only advertising banners and affiliate links. The same goes for startpages that are created solely as part of a link network. One example is startpages whose links all point to the same website and to other startpages that also all point to that same website. Startpages of this kind have no value at all for your visitors and also go against the Google Webmaster Guidelines.

We hope that with this post, our first in Dutch, we have been able to give potential startpage maintainers a number of useful tips that will enable them to create the kind of startpages our Dutch-speaking users love!

Finally, we would like to encourage everyone to fill in a paid links form when you come across a startpage that buys and sells links in order to manipulate search engines. You can report other matters that go against the Google Webmaster Guidelines by sending in a spam report. We review every report that is submitted and use this information to further improve our algorithms and search results. As always, your feedback and help in providing our users with the most relevant search results is hugely appreciated!