Access Google’s data about your Website

December 13th, 2005

The official Adsense blog has a pointer informing us that the Google Sitemaps service has a handy extra feature that lets webmasters view statistics on their websites from Google’s perspective.

Once you create a Sitemaps account and upload an empty HTML file to verify that you own the website, Google provides statistics about the PageRank distribution across all of your pages, the top queries to Google that return pages from your site as results, and the top queries that actually directed traffic your way.

In addition, you can check exactly what errors (if any) the Googlebot encountered while crawling your site.

As an extra convenience, they also throw in a handy “Index stats” section that lets you run advanced queries on your site like the link:, site: and allinurl: commands.

Google Sitemaps home
Google Sitemaps help: Viewing statistics and errors for your site

Webmasterworld Shuts Out The Search Engines

November 23rd, 2005

I noticed a very interesting thread on one of my favourite sites, Brett Tabke’s Webmasterworld, in which he has decided to completely ban all search engine robots from the site with a robots.txt file.

This means no pages from Webmasterworld will be findable in Yahoo, Google, MSN or any other legitimate search engine. The main problem members have with the idea is that Webmasterworld is an enormous site, which I think has in the region of half a million forum posts… and no decent onsite search engine of its own.

The main method members used for finding anything on the site was through a well-formed query on a search engine, so the task is much more difficult now. Tabke says a new site search solution is in the pipeline, but it doesn’t appear that it’ll be online immediately.
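For readers unfamiliar with how a blanket robots.txt ban works, here is a minimal sketch using Python's standard `urllib.robotparser`. The two-line file below is the generic "disallow everything" form and is assumed for illustration; it is not a copy of Webmasterworld's actual file, and the example.com URL and the `is_allowed` helper are placeholders of mine.

```python
from urllib.robotparser import RobotFileParser

# Generic "ban every crawler from every page" robots.txt.
# Assumed for illustration; not Webmasterworld's real file.
BAN_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/forum/thread.htm"  # placeholder URL
    for bot in ("Googlebot", "Slurp", "msnbot"):
        print(bot, "allowed:", is_allowed(BAN_ALL, bot, page))
```

Note that robots.txt is purely advisory: well-behaved crawlers such as the major search engines honour it, while a rogue bot can simply ignore the file, which is presumably why Tabke describes it as only the first step.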

Heavy spidering by rogue search crawlers is the main reason given by Tabke for the move, which will last for “a month or three”:

we have been doing EVERYTHING you can think of. This is a part of that ongoing process. We can’t require all people to login and allow bots onto the site (eg: pure cloaking). Even the random ad scripts we cloak off to keep bots from seeing session id like content, gets grumbles from alot of members. The claims are that we are either selling links (which they claimed about our links to westhost and now rackspace are paid), or claim we are cloaking to get higher pr when we do block bots from seeing session ids. eg: no win situation for us.

So, we start by banning bots, and then follow immediatly with required cookies/logins for everyone. That will stop most of the bots. The ones it don’t, we will follow up with session id’s, and auto ban in htaccess for page view abuse. Lastly, we will move to captcha logins, and then random login challenges with other captcha gfx requirements.

lets try this for a month or three…

Related:
WebmasterWorld Bans Spiders From Crawling

Add to Google chicklets set to take over Web

November 21st, 2005


Google has released an official “Add to Google” button for bloggers to put on their sites, which lets their readers subscribe to the blog feed in Google’s RSS Reader or the Personalised Google.

The buttons are similar to the ubiquitous Add to My Yahoo! images which proliferated across the web back in 2004, and it probably won’t be long before every blog in christendom sports a Google one as well.

Click here to see what a typical landing page looks like.

Via Jeremy Zawodny’s linkblog

All Your Base Are Belong To Google

November 16th, 2005

Just days after the shaky release of Google Analytics, Google have announced that one of their more curious offerings is now available, albeit this time under a Beta label.

I have to admit I was initially puzzled by Google Base, and in a way still am - its purpose doesn’t jump out at first glance.

It’s also difficult to explain in a simple comparison with something else, though the Web seems intent on comparing it to either eBay or craigslist.

It isn’t mainly concerned with online buying-&-selling like those businesses, however; it works more as a simple web-publishing system where people can post any type of information - not just advertisements - with a view to reaching an audience.

Depending on the type of information you upload to your Base, it is then made available to all of the other users searching on Google Base, and may even get propagated out to Froogle, Google Local or to Google.com’s search index, depending on which one it is relevant to.

One basic advantage I can see of Google Base is to Google itself - it allows Google to index information that the normal G search engine would find impossible to index itself; for instance in the FAQ, they suggest users add offline content to the Base.

As the company’s stated goal is to organize the world’s information and make it universally accessible, what better way to achieve this than to offer the world an easy way to send in the information that the web crawler can’t reach itself, and to offer users an incentive for doing so? In other words, users receive a portion of Google’s audience for their item in exchange for adding a little more to Google’s vast collection of information.

For more on Google Base:
Official Google Base FAQ
Memeorandum
John Battelle: All Your Base Are Google, The Launch
Google Base creates a structured Web

Google Analytics Off To A Rough Start

November 15th, 2005

Like many others eager to try out Google’s new traffic-tracking service, I added the Google Analytics javascript to my sites yesterday, and was informed that I had added it correctly and my stats would begin to appear in twelve hours.
However, it’s now 27 hours later and I’m still seeing no stats, a problem a lot of people seem to be having.

G Analytics is one of the few new Google products to be released that doesn’t carry a ‘Beta’ label, but appears to be having some serious problems for an official product release.

The blogosphere’s reaction yesterday was initial excitement, followed by lots of questions about why the site was so slow and frustration at constantly seeing an “under maintenance” message when trying to log in.
Those problems appear to have been solved - the site seems zippy enough today - but many others and I have seen a much longer delay than 12 hours before the system actually shows stats, and the Help system seems to be only half-complete, with lots of the pages throwing a 404.

I’m eager to try the service out properly - the Flash & DHTML-based piecharts and graphs look great, and there are lots of different metrics to look at, but as of this posting, no actual data from my site.

In the meantime, I’ve been searching for more articles and opinion about the service, and came across an interesting one in the latest Axandra search newsletter, which focuses more on the privacy concerns surrounding the product and Google in general:

Google already knows a lot of things about you. If you also use their new tracking service, you will tell Google how much you earn, when you earn it, which products you sell, how often you sell them, how much you spend for ads on other sites and you will reveal much more information about your online business.

Ask yourself if you want Google to know that much about you and your company. Do you really want to share your revenue information with a company that also wants your advertising dollars? Do you want to share your revenue information with any other company at all?

Google officials have declined that they will use the data to better understand how much you are willing to pay for ads, based on conversions. They also claim that they do not plan to tap into the data as a means of improving regular search results or to identify bad sites. Nevertheless, these things are easily possible if you use Google Analytics.

Google engineer Matt Cutts even writes in his blog: “Blackhat SEOs may be leery of using Google for analytics, but regular site owners should be reassured.” That sounds as if Google might actually use the information for other purposes.

Google Analytics - is it worth its price?

More:
Problogger: Google Analytics - First Impressions
Google: Start acting like a real business or you’re doomed

Google Announce Free Web Analytics For All

November 14th, 2005


Google have finally made the Urchin Web Analytics technology they acquired back in March a free product for all.

The newly launched Google Analytics service allows web site owners to track exactly how visitors found their sites and how they interact with them. The service is sure to be a major threat to the current stable of traffic-tracking providers.

Google Analytics looks to be basically a hosted rebranding of Urchin Web Analytics, which previously cost $199 per month, and required users to download the software to their machines.

Under G Analytics, all site operators have to do is create an account, paste a snippet of Javascript into their pages and voila, they’re done.

The press release announcing the service boasts that it runs on Google’s vast computer infrastructure, and will be able to “support the traffic demands of any site, from those with a few visitors a week to hundreds of millions” - however it appears to be having some initial teething problems, and users trying to sign up are greeted with an extremely slow loading time and persistent “under maintenance” messages.

Introducing Google Analytics.
Sophisticated. Easy. Free.

Web Analytics Free of Charge, Courtesy of Google
Google Analytics Official Forum & Discussion
Google Blog: The circle of analytics
Google’s “Conversion University”

Jeeves To Be Phased Out of Ask.com

September 24th, 2005


Ask.com have revealed that the famous Jeeves butler is to be phased out completely and removed from the site.

In a statement, the company explained that Jeeves gives searchers the impression that the only service Ask offers is the ability to perform a search query with a straightforward question; less prominent are its other tools like desktop search and the excellent Bloglines RSS reader.

No specific timeline for the removal of Jeeves has yet been announced, and it’s unclear if a new brand name for the site will be introduced.

Ask Jeeves decides to axe Jeeves

Yahoo! Launch Instant Search Beta

September 17th, 2005


A new beta search tool has been posted up on Yahoo’s Next website, the showcase site for their new products.

Instant Search is an AJAX-based interface for Yahoo search that lets you do a search and go straight to your destination without ever having to see a results page - as you type in the search field, popular searches and shortcuts to Yahoo’s services are grabbed dynamically and presented in a pop-up bubble. For instance, typing “mail” brings up the shortcut to Yahoo Mail, which lets you visit with one click.
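The as-you-type behaviour described above boils down to prefix matching against a list of known shortcuts. Here is a rough sketch of that idea. The shortcut table, the `suggest` function and the alphabetical ordering are all my own assumptions for illustration; Yahoo have not published how Instant Search actually picks or ranks its suggestions.

```python
# Hypothetical shortcut table, made up for this example.
# The real service's data and ranking are not public.
SHORTCUTS = {
    "mail": "http://mail.yahoo.com/",
    "maps": "http://maps.yahoo.com/",
    "news": "http://news.yahoo.com/",
}


def suggest(typed: str, shortcuts: dict[str, str] = SHORTCUTS) -> list[tuple[str, str]]:
    """Return (keyword, url) pairs whose keyword starts with what was typed."""
    prefix = typed.strip().lower()
    if not prefix:
        return []  # nothing typed yet, so nothing to suggest
    return sorted(
        (keyword, url)
        for keyword, url in shortcuts.items()
        if keyword.startswith(prefix)
    )


if __name__ == "__main__":
    # Simulate the user typing "mail" one character at a time.
    for typed in ("m", "ma", "mai", "mail"):
        print(repr(typed), "->", [keyword for keyword, _ in suggest(typed)])
```

In the real interface, each keystroke would trigger an asynchronous request and the matches would be drawn in the pop-up bubble; the lookup itself is the only part sketched here.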

The service takes a dig at Google’s famous “I’m Feeling Lucky” tool that sends you to the top result without viewing a search results page; the description for Instant Search asks “Why feel lucky when you can be right? With Instant Search, find the information you want — without even hitting the search button.”.

The dynamic retrieval of results is also very similar to Google’s Suggest tool, only with Instant Search you are taken directly to the destination page, while Suggest just helps you as you create your search query.

Check out Instant Search here

Related:
Google Suggest
Asynchronous Javascript + XML (AJAX)

Google Unveil Blog Search

September 14th, 2005


Users of the woefully slow, spam-ridden Technorati will be happy to learn that Google have just announced their first foray into the world of blog search.

Blogger.com users won’t have missed the prominent addition of Google Blog Search on the dashboard page that allows you to “Search blogs from all over the web”, not just Blogger-hosted ones.

The site has that user-friendly feel seen across Blogger.com and is as fast as you’d expect from Google, featuring the usual keyword search as well as two advanced options beside each result that let you find phrases within the pages of a specific blog, or view all of the pages from a blog.

It also includes a tool to find references to individual pages, one of the features I use the most on Technorati and Bloglines.

The fact that the index for the search engine is limited to blogs published after March of this year shows that Google Blog Search is still in beta status, but Google will likely tweak and improve it as they go along.

Google Blog Search Beta on Blogger.com
Main Google Blog Search Site
About Google Blog Search

Related:
BBC: Google unveils blog search site
Blogger Buzz: Explore Blogs
Google (Finally) Offers Blog Search

Samsung Proclaim “Digital Paper Age” With New Flash Chip

September 13th, 2005


Samsung have made a bold announcement that the world is set to enter a “digital paper age” with the introduction of a 16 gigabit flash memory chip.

According to the company, devices like CDs and hard disks are set to disappear as flash memory dominates the information age:

“In the same way that civilization rapidly progressed after paper was invented 2,000 years ago, flash memory will serve as the ‘digital paper’ to store all kind of information from documents to photos and videos in the future. Mobile storage devices like CDs and hard disks will gradually disappear over the next two or three years, and flash memory will dominate the information age,” [Samsung CEO Hwang Chang-gyu] said.

Using nano technology, Samsung has doubled the density of memory chips for six years running, ever since developing a 256 Mb chip in 1999. Fifty nanometers is as small as one-two thousandth of a human hair. A 16 Gb flash memory chip made with the technology contains 16.4 billion transistors in a thumbnail-sized space.
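The doubling claim in the quote is easy to check. The sketch below assumes one doubling per year from the 256 Mb chip in 1999, and takes 1 Gb to mean 1024 Mb; both are my reading of the figures, not something the article spells out.

```python
START_YEAR = 1999
START_MEGABITS = 256  # the 256 Mb chip mentioned in the article
DOUBLINGS = 6         # "six years running"


def density_after(doublings: int, start_megabits: int = START_MEGABITS) -> int:
    """Chip density in megabits after the given number of doublings."""
    return start_megabits * 2 ** doublings


# Print one row per year, assuming exactly one doubling each year.
for n in range(DOUBLINGS + 1):
    megabits = density_after(n)
    print(START_YEAR + n, f"{megabits} Mb", f"({megabits / 1024:g} Gb)")
```

Six doublings multiply the starting density by 64, and 256 Mb x 64 = 16,384 Mb, which is the 16 Gb figure Samsung is announcing.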

Samsung Develops 16 Gb Flash Memory Chip

Related:
Samsung’s Digital World: Semiconductor - Flash

7 Editions of Windows Vista?

September 12th, 2005


Windows IT Pro is carrying an exclusive from the Microsoft Professional Developers Conference that the next OS from the software giant is to come in no fewer than seven different flavours.

The editions will fall into two general categories - Home and Business, much like Windows XP Home and Pro editions.

The Home category will feature 4 editions, from the Vista Starter Edition aimed at beginner computer users in emerging markets who can only afford a low-cost PC, to Vista Ultimate, a version aimed at high-end PC users such as gamers and technology enthusiasts.

The Business category will consist of Windows Vista Small Business Edition, Windows Vista Professional Edition, and Windows Vista Enterprise Edition.

IT Pro notes that these product names are currently classed as placeholders and as such might be subject to change before their final release, but also says they are unlikely to be renamed.

For a detailed run-through of each edition, check out the article here.

Related:
Official site for Windows Vista
Internet Explorer 7 Logo Revealed
Gaming a First-Class Citizen on Windows Vista

Rumour has Google Acquiring Reuters

September 12th, 2005


A surprising rumour has hit the blogosphere that Google may acquire the ubiquitous news service Reuters.

The story feels far-fetched, and John Battelle has posted on why he’s treating this one with scepticism:

I mean, what on earth would the Google culture DO with Reuters? Really, I’ve dealt with both companies extensively, I can’t imagine a worse relationship. On the other hand, they might just buy em and run them like an absentee landlord. Odder things have happened…at Google anyway.

Interesting Times…

Related:
History of Reuters
Cerf’s up at Google

MSN First to Implement RSS Search Operators

September 4th, 2005


WebProNews.com is reporting that MSN have added two new search operators to search.msn.com for searching RSS feeds without the need for creating long, complex search strings.

Putting feed: before your search term will return any RSS, RDF or Atom feeds that contain the query, so for example feed:Igniq will return not only the various feeds on Igniq.com but any feed that mentions the word Igniq.

The other operator is hasfeed: and will return any page that links to a feed that contains your search query.
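To make the two operators concrete, here is a small sketch that builds the query strings described above and URL-encodes them. The helper names are my own, and I have deliberately left out the address of the MSN results page and any parameter names, since those would be guesses.

```python
from urllib.parse import quote_plus


def feed_query(term: str) -> str:
    """Query for feeds (RSS, RDF or Atom) that contain the term."""
    return f"feed:{term.strip()}"


def hasfeed_query(term: str) -> str:
    """Query for pages that link to a feed containing the term."""
    return f"hasfeed:{term.strip()}"


def encode_for_url(query: str) -> str:
    """Percent-encode a query so it can be placed in a URL query string."""
    return quote_plus(query)


if __name__ == "__main__":
    for query in (feed_query("Igniq"), hasfeed_query("Igniq")):
        print(query, "->", encode_for_url(query))
```

The only wrinkle is the colon, which gets percent-encoded as `%3A` when the query travels in a URL; typed into the search box, the operators are entered exactly as shown in the post.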

The new operators are unique to MSN and mark the first major search engine to include direct-search for RSS files, but I can’t really see many people actually using them because raw RSS files are messy and awkward to deal with; you have to copy the URL, open your aggregator and subscribe.

Maybe MSN should take a leaf out of Yahoo!’s book and fold RSS into the results in an extremely easy-to-use way. RSS feeds on Yahoo search results appear as a simple “Add to My Yahoo!”, one click and the feed is added to your personal page with no fuss.

MSN have an interesting side-project in development over at Start.com that may be pushed into the spotlight in this way; Start is a web-based RSS reader that has a function to subscribe to RSS feeds straight from a search. Perhaps we might see an “Add to My Start Page” show up on the MSN search results when Start goes out of beta.

Related:
MSN Sandbox - New MSN technologies and prototypes

Jacques Chirac Feels Threatened by U.S. Search Engines

September 3rd, 2005


The Telegraph is running a story about Jacques Chirac’s plan to fund a euro-centric search engine in order to defend against the threat of “Anglo-Saxon cultural imperialism.”

The French president is concerned about the dominance of American giants like Yahoo! and Google in French culture and intends to provide loans to a French group to create a multimedia search engine that will be based on current search technology developed in France and Germany.

The new search engine is the second initiative Chirac has given backing to in response to moves by the search giants across the Atlantic.
When Google announced their Print project to put the knowledge contained in U.S. libraries online, the French government pushed to create a rival digital library of European literature called Project Quaero:

“Culture is not merchandise and it cannot be left to the blind forces of the market,” Mr Chirac said in a speech earlier this year giving the go-ahead for work to begin on [Project Quaero]. “We must staunchly defend the world’s diversity of cultures against the looming threat of uniformity.”

Chirac backs eurocentric search engine

Ballmer: “I’m going to fucking kill Google.”

September 3rd, 2005


A lawsuit Microsoft filed after one of their executives defected to Google has revealed MS CEO Steve Ballmer really, really doesn’t like Google.

Executive Kai-Fu Lee jumped the Microsoft ship for Google in July, prompting MS to file the lawsuit, arguing that Lee was in violation of a one-year non-compete agreement because he could pass Microsoft Search secrets to Google.

Documents made public in the legal proceedings have revealed that, in 2004, another engineer informed Ballmer of his intention to leave for another company.

Ballmer replied, “Just tell me it’s not Google.” When the engineer said it was, Ballmer hit the roof, picked up a chair and flung it across his office, hitting a table.

He then launched an amazing, bilious tirade against Google and their CEO:

“Fucking Eric Schmidt is a fucking pussy. I’m going to fucking bury that guy, I have done it before, and I will do it again. I’m going to fucking kill Google.” ….

Thereafter, Mr. Ballmer resumed trying to persuade me to stay….Among other things, Mr. Ballmer told me that “Google’s not a real company. It’s a house of cards.”

C|NET: Court docs: Ballmer vowed to ‘kill’ Google

Related:
Kai-Fu Lee’s biography page on Microsoft.com
About Steve Ballmer
Seattle Times: Gone to Google: Microsoft sues over exec’s defection (20 July)

Google Readying Massive Free WiFi Network?

August 17th, 2005


Business 2.0 has an interesting report speculating that Google may have a secret plan to roll out free Wi-Fi access to everyone in America.

The basis for the report is the fact that Google have been buying up literally miles and miles of unused fibre optic cable at bargain-bin prices. Add some super-fast connections it has been snapping up and an interesting rumour is born.

The company has also set up a free Wi-Fi hotspot in San Francisco, which throws even more fuel on the fire:

One of the cheapest ways would be for Google to blanket major cities with Wi-Fi, and evidence gathered by Business 2.0 suggests that the company may be trying to do just that. In April it launched a Google-sponsored Wi-Fi hotspot in San Francisco’s Union Square shopping district, built by a local startup called Feeva. Feeva is reportedly readying more free hotspots in California, Florida, New York, and Washington, and it’s possible that Google may be involved. Feeva CEO Nitin Shah confirms that the company is working with Google but won’t discuss details. Google’s interest in Feeva likely stems from the startup’s proprietary technology, which can determine the location of every Wi-Fi user and would allow Google to serve up advertising and maps based on real-time data.

Free Wi-Fi? Get Ready for GoogleNet. via Engadget

Internet Explorer 7 Logo Revealed

August 16th, 2005


The new logo for Microsoft’s long-awaited IE7 has been published on the official IEBlog.

Microsoft have darkened the famous “e”, and added a new golden ring that is meant to make the logo stand out a bit more than the one for IE 6.

The logo above will be used for Windows XP with Service Pack 2, Server 2003 SP1 and x64 versions, while the logo for Windows Vista is not quite complete yet and will sport a slightly different look.

Last month, Microsoft released a beta version of the browser to a select number of developers, but the file soon leaked and can now be found on popular P2P sites.
Take care if you want to try the beta, however, as Microsoft offers absolutely no support if the program does any damage to your system.

New IE 7 Icon and Logo via Inside Microsoft

Google Ceases Scanning In-Copyright Books

August 12th, 2005


Google have used their official blog to announce that, because of concerns by authors of copyrighted books, the company will not scan any more in-copyright books for its Google Print programme until November.

The Google Print initiative was launched late last year, its aim to make the wealth of knowledge contained in the libraries of the world available to anyone online for free. The company secured deals with major libraries in the U.S. to send workers in and physically scan books in a massive project that is expected to take years to complete.

The new blog post argues the positive advantages to authors if they participate; namely that potential book buyers can be directed to the official site of the author and that they can earn revenue from their work through the contextual ads that display on the results pages, even if the original book is now out of print.

This is an aspect of the programme I hadn’t been aware of. I haven’t seen any other details on how it works, but presumably authors who are featured are offered an AdSense account when they participate.

The November delay is intended to allow copyright holders plenty of time to send Google a list of exactly which books they want Google to exclude from the programme.

In other words, if Google don’t hear from the authors, the search engine giant will simply presume that they have no problem with having their work displayed in the results.

The AP is reporting that the Association of American Publishers are not happy with this system:

Although the Project will get underway with the digitization of works in the public domain over the next three months, Google’s plan calls for digitally copying every work in the collections of three major libraries unless specifically denied permission for a particular work by the copyright owner. “Google’s procedure shifts the responsibility for preventing infringement to the copyright owner rather than the user, turning every principle of copyright law on its ear,”

Related:
Google Blog: Making books easier to find
About Google Print

Google Stealthily Upping Adsense Earnings to Counter Yahoo!?

August 11th, 2005


It appears that Google have been rattled by Yahoo!’s entry into the contextual ad business and have begun upping the earnings of AdSense publishers.

Jason Calacanis, the man behind the Weblogs Inc. blog empire, has reported a ~15% increase in his AdSense earnings since it was announced that the Yahoo! Publishers Network had gone beta. The YPN is set to become the main competitor to AdSense when it launches by year-end. This is obviously a good thing for AdSense publishers; hopefully it will lead to more competitive benefits and more transparency on Google’s part. Pretty outrageously, Google currently do not disclose what percentage of the earnings publishers get for displaying ads on their sites.

Google Adsense torpedoes Yahoo Publisher Network with a 15% better split?!?

Yahoo! Messenger 7 Launches

August 10th, 2005


Yahoo! have launched version 7 of their Yahoo! Messenger software, which includes a number of new features.

Top billing goes to a new feature that allows worldwide PC-to-PC calls, though I can’t see how this is different from the previous version, which allowed any two users who had a microphone to converse.
What is definitely new however is the ability to “save” the calls, much like an archive can be kept of all the text-messages sent through the program.
Clicking through to Archive in the preferences now gives a list of the calls made to you, and a built-in media player to play them back.

Also included is a drag & drop functionality for sharing files and photos. These were present in a recent beta version of the program, and the photo-sharing aspect is pretty useful, opening a mini-window in the chatbox that lets you see the images you share with friends.

The company’s Y!Q inline search product has also been folded in, letting you search without having to go to the main Yahoo! search engine page. Selecting a word in a conversation window and hovering the mouse over it opens an inline search for the word. This is a pretty useful feature, and a further sign of the focus on search across Yahoo!’s services.

More features include integration with Yahoo! 360, the blogging tool that Yahoo! launched a couple of months ago, and a new set of emoticons that can be viewed here.

Yahoo! Messenger homepage - Flash tour of the new version

