Google crawling
How do web pages get into Google, and who or what determines the order in which they appear when a user searches on Google?
Periodically, Google crawls all the web sites that are listed in www.dmoz.org, and any web sites that those sites link to. For example, if www.MyTerrificWebsite.com is listed in www.dmoz.org, Google will crawl that web site. If www.MyTerrificWebsite.com has a link to www.MyEvenBetterWebsite.com on it, Google will crawl www.MyEvenBetterWebsite.com as well. Once Google has crawled a web site, it caches a copy of it.
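The link-following behavior described above can be sketched as a breadth-first walk over a link graph. This is only an illustration, not Google's actual crawler: the `crawl` function and the in-memory `links` dictionary are hypothetical stand-ins, and no real pages are fetched.

```python
from collections import deque


def crawl(seed_sites, links):
    """Visit every site reachable from the seeds, following links breadth-first.

    `links` is a hypothetical in-memory map of site -> sites it links to;
    a real crawler would fetch each page over HTTP instead.
    """
    cache = {}                      # site -> outgoing links (stand-in for a cached copy)
    queue = deque(seed_sites)
    while queue:
        site = queue.popleft()
        if site in cache:           # already crawled, skip it
            continue
        outgoing = links.get(site, [])
        cache[site] = outgoing      # "cache" the site once it has been crawled
        queue.extend(outgoing)      # crawl whatever it links to next
    return list(cache)              # sites in the order they were crawled


links = {
    "www.MyTerrificWebsite.com": ["www.MyEvenBetterWebsite.com"],
    "www.MyEvenBetterWebsite.com": ["www.MyTerrificWebsite.com"],
}
order = crawl(["www.MyTerrificWebsite.com"], links)
print(order)
```

Starting from the one listed site, the sketch also reaches the site it links to, and the check against `cache` keeps the two sites from sending the crawler in circles.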
As Google crawls each web site, it scans the HTML code on each web page. Google then takes all this information and runs it through the algorithm used to determine PageRank and placement within the search engine results. Google and the other search engines each have their own algorithm for deciding which pages should rank high in the search results and which shouldn't. This algorithm is a logical decision tree which looks for certain key factors to determine search engine ranking.
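To give a rough idea of what "scanning the HTML" can mean, here is a minimal sketch using Python's standard `html.parser` module. It pulls out just the page title and the link targets. Which factors Google really extracts is not public, so the `PageScanner` class, the `scan` helper, and the choice of these two fields are assumptions made purely for illustration.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects the <title> text and every <a href="..."> target on a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:                      # ignore anchors without a target
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:                # only keep text found inside <title>
            self.title += data


def scan(html):
    """Hypothetical helper: return (title, links) for one page of HTML."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.title.strip(), scanner.links


sample = """<html><head><title>My Terrific Website</title></head>
<body><a href="http://www.MyEvenBetterWebsite.com/">A better site</a></body></html>"""
title, found = scan(sample)
print(title, found)
```

The links collected this way are what a crawler would follow next, and the text is the kind of raw material a ranking algorithm could look at.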
What is this algorithm? None of the search engines reveal this information, because if they did, people would write web pages only to rank well in the search engine rankings, instead of creating web pages with good content. Ultimately, the search engines want the most relevant results to appear when users search on various topics.
Google constantly fine-tunes its algorithm so that it places the most relevant search results at the top of its search engine results. Periodically, Google runs this algorithm on all the web pages it has cached. This is what some SEO (Search Engine Optimization) folks call the "Google Dance". In the past, Google updated its index roughly once a month, and the "Google Dance" was a noticeable flux in the Google database. Recently, Google switched to more of a constant update process. This means the index is updated in smaller increments rather than all at once, and there's no obvious Google Dance. While the excitement of the Google Dance is gone (no, I don't have a life), I've noticed that additions or changes to web sites are getting indexed more quickly than in the past.