Google reveals new version of its search engine

Google today unveiled "Caffeine", the codename for a secret project to develop the next generation of the company's search-engine infrastructure.

Web developers have been invited to test the new version of the search engine and to provide feedback on any differences they observe between results from the current and new systems.

The preview was announced today by the technical lead, Matt Cutts, on the company's official Webmaster Central Blog.

For the last several months, a large team of Googlers has been working on a secret project: a next-generation architecture for Google's web search. It's the first step in a process that will let us push the envelope on size, indexing speed, accuracy, comprehensiveness and other dimensions. The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results. But web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback.

I welcome this opportunity to preview the new service and report back with my observations. Preliminary data from my own tests so far -- covering key metrics including search speed, the number of results returned, and relevance -- don't indicate any big differences between the current and new versions of the search engine. Qualitative and quantitative differences -- perceived and measured -- are either not big enough to be readily apparent or not consistent enough to support general conclusions. There are obvious differences in the layout of the results page, e.g. news and video sections are now both at the top for applicable queries. Comparisons with Bing are inevitable, since the Caffeine sandbox seems to be copying Microsoft's search engine.

The test site is not yet finished and currently has a number of issues, such as broken links and other errors, about which Google's engineers do not currently require feedback. [Note: on 13 August Google's test service was down, with an error message saying the data centre was being updated.]

10 August 2009

Comments: 5

Tim Acheson (11 Aug 09, 15:15)

Predictably, yesterday's news is generating a lot of hype and hysteria. Everybody wants a piece of the action. People are so keen to have a say, it seems many of them do so without reading Google's announcement, since a number of commentators seem to have missed the point of this test.

The Daily Telegraph in the UK provides a typical example:

Martin McNulty, director of search marketing specialist Trafficbroker, who has tried the new version, said: “Google's caffeine is undoubtedly faster, almost twice as fast at times. It's like a Google GTi.”

The excitement is more infectious than swine flu. That single quote turned into the headline: "Google reveals caffeine: a new faster search engine".

In reality, this test is not about the speed with which results are returned. I wouldn't suggest it's faster, either, and if I did we could safely assume it's because the new version carries far less traffic. The sandbox test probably has a tiny fraction of one percent of the load on the existing search engine.

As Google points out, the change only affects the infrastructure, and the only mention of speed is "indexing speed", which means the speed with which Google captures useful data as it crawls the web. In Google's words:

The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results.

It's an endless source of entertainment, to see how the power of suggestion affects the psychology of other commentators. Google’s reputation is literally intoxicating -- intelligent adults abandon rational thought under its influence. Is there an implicit assumption that anything Google does must be impressive in every imaginable way? "Wow, Google did something, so it must work faster now!"

Tim Acheson (13 Aug 09, 17:24)

I noticed that the BBC jumped on the same bandwagon as everybody else, quoting the same person as the Telegraph, and basing their headline on the same unscientific claim: "New Google 'puts Bing in shade'". This is often how mainstream "news" is generated. A spurious sensationalist quote is published somewhere, then the same thing is regurgitated elsewhere with minor adaptations, in a process of dissemination and legitimisation that can be extremely rapid. This is a good example of contemporary mass-hysteria.

Tim Acheson (19 Aug 09, 11:56)

Querystring Spam in Google's new search engine

Among the issues I've identified and reported so far in Google's Caffeine sandbox is an interesting form of URL manipulation Spam that I haven't seen before, which could indicate a new vulnerability.

E.g. In a search for "open directory project" the first result listed is dmoz.org. But notice that appended to the URL is the domain name of a Turkish search engine: "www.dmoz.org/?id=www.interturknet.com/".

Presumably somebody is deliberately linking to dmoz.org using a URL with this rogue query string parameter. The problem doesn't affect the current version of Google search. I've observed this phenomenon with minor web sites before -- but never with such a major URL, for which there must be millions of correct links and an extremely high ratio of correct links to links with querystring Spam.
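To make the pattern concrete, here is a rough sketch of how one might flag this kind of URL automatically. It is only an illustration of the idea, not how Google or anyone else actually detects it: the function name suspicious_params and the HOST_LIKE pattern are my own hypothetical choices, and the heuristic (a query string value that looks like a domain name other than the page's own host) will certainly produce false positives, e.g. for values such as file names.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical heuristic: a value "looks like a host" if it resembles a
# dotted domain name, optionally followed by a path.
HOST_LIKE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/.*)?$", re.IGNORECASE)


def suspicious_params(url):
    """Return (name, value) pairs whose value looks like a foreign domain."""
    # Search results often show URLs without a scheme; add one so that
    # urlsplit can separate the host from the path and query string.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    flagged = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        # Flag values that look like a domain and do not start with the
        # host of the page itself.
        if HOST_LIKE.match(value) and not value.lower().startswith(host):
            flagged.append((name, value))
    return flagged


if __name__ == "__main__":
    # The URL observed in the Caffeine sandbox results.
    print(suspicious_params("www.dmoz.org/?id=www.interturknet.com/"))
```

Run against the URL above, this reports the "id" parameter as suspicious, while the plain "www.dmoz.org/" URL yields nothing.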

Tim Acheson (10 Nov 09, 11:03)

Update: this update to Google's indexing system looks likely to go live soon, and the preview is now offline.

"Based on the success we’ve seen, we believe Caffeine is ready for a larger audience. Soon we will activate Caffeine more widely, beginning with one data center. This sandbox is no longer necessary and has been retired, but we appreciate the testing and positive input that webmasters and publishers have given."
