My guess is that on Aug 25, 2003 Google's index was full. Why do I say this? . . .
. . . The database was constructed using a Document_ID that is associated with
each Web page. This Document_ID was published as being a 4-byte unsigned
long integer. This means that for every single Web page Google has in their
index, an ID was created to identify this Web page. But like everything,
there is a limit: a 4-byte unsigned long integer can hold only 4,294,967,296
distinct values (0 through 4,294,967,295). So if no changes are made to their
database structure, it
would mean Google has probably reached this threshold. And as new pages are
added, old pages are removed (disappear). Quite alarming, isn't it?
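The arithmetic behind that claim is easy to check. Here is a minimal Python sketch, assuming the ID really is a plain 4-byte unsigned integer as the quote says; the names `DISTINCT_IDS`, `MAX_ID` and `fits_in_document_id` are my own, made up for illustration, and say nothing about how Google actually stores its index. Note that 4 bytes give 4,294,967,296 distinct IDs, so the largest single ID is one less than that.

```python
import struct

ID_BYTES = 4                      # size of the ID, per the quoted claim
DISTINCT_IDS = 2 ** (ID_BYTES * 8)  # how many different IDs 32 bits can encode
MAX_ID = DISTINCT_IDS - 1         # largest ID value, since numbering starts at 0


def fits_in_document_id(n: int) -> bool:
    """Return True if n can be packed into a 4-byte unsigned integer."""
    try:
        struct.pack("<I", n)      # "<I" = little-endian 4-byte unsigned int
        return True
    except struct.error:
        return False


print(f"{DISTINCT_IDS:,}")        # 4,294,967,296
print(f"{MAX_ID:,}")              # 4,294,967,295
```

So `fits_in_document_id(MAX_ID)` is true, while one more page would need an ID that no longer fits in 4 bytes.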
It's puzzling: no one except Google knows how they manage their index and ranking. All anyone can do is observe the end product and draw conclusions and theories from that.
I've noticed myself that if you search for "the" in Google, you'll get far more results than the page count posted on the Google homepage. Maybe Google's index is much larger than anyone thinks, and the homepage number is a low priority to them as long as no one has a higher one (Google increased the number in 2003 just days after AllTheWeb.com posted a record index of 3.2 billion pages).