Laughing Collie
Productions

Newsletter Archives: Search Engines

How do search engines work? How do I find my site on a search engine?

As I write this, it is June 23, 2003. If you were online today, you could have visited the home page of one of the best-known search engines, at the Google Technology, Inc. web site. At the bottom of that page is this note:

Searching 3,083,324,652 web pages

Every time you typed a word into the box on that page and clicked the "Google Search" button, the computers at Google would comb through an index of all those pages to see whether the words you typed appear on any of them. If any matching pages were found, the program would try to put them in a plausible order and show you what it found.
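To make the idea of "combing through an index" a little more concrete, here is a minimal sketch of a word index in Python. It is only an illustration, not how Google actually does it: the three pages are made up, and the rule that a page must contain every word in the query is my own assumption.

```python
from collections import defaultdict

# Hypothetical miniature "web": page address -> page text.
pages = {
    "a.example/candy": "chocolate candy for sale",
    "b.example/cake": "chocolate cake recipes",
    "c.example/dogs": "collie training tips",
}

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query (an assumption)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(pages)
print(sorted(search(index, "chocolate")))
print(sorted(search(index, "chocolate candy")))
```

The point of building the index ahead of time is that a search only has to look up a few words, rather than reread every page.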

How Does A Search Engine Find Out About My Web Site?

Before any of this can occur, Google first has to collect all those pages from individual sites all over the Internet. Google has whole banks of computers running programs that do nothing but visit web pages and record the results, twenty-four hours a day, seven days a week. These programs are called bots, spiders, crawlers, and other colorful names. Periodically, they visit the home page of a site and follow every link they can find.
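The "follow every link" step can be sketched with Python's standard library. This is a simplified illustration, not any search engine's actual crawler: it only pulls the links out of one page of HTML, and the sample page and addresses are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> tag on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Turn relative links into full addresses.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(base_url, html):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links

# Hypothetical page content.
sample = '<p><a href="/about.html">About</a> <a href="http://other.example/">Other</a></p>'
print(extract_links("http://www.example.com/", sample))
```

A real crawler would fetch each address it finds, extract the links from that page in turn, and keep a list of pages already visited so it does not go in circles.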

The scope of this task is quite large. Remember that quote from the Google front page? "Searching 3,083,324,652 web pages." If we assume for a moment that Google tries to visit every page once a month, to check on the contents (or existence) of every page, its crawlers have to visit over 100 million pages a day. That's about 1,200 pages per second, every second of every day of the year.
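The arithmetic behind those figures is easy to check. The 30-day month is my assumption; the page count is the one quoted from the Google front page.

```python
PAGES = 3_083_324_652           # from the Google home page note
DAYS_PER_MONTH = 30             # assumption: one full pass every 30 days
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

pages_per_day = PAGES / DAYS_PER_MONTH
pages_per_second = pages_per_day / SECONDS_PER_DAY

print(f"{pages_per_day:,.0f} pages per day")
print(f"{pages_per_second:,.0f} pages per second")
```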

New sites appear on the Internet continuously. No one really knows how many sites there are, or how quickly the number is growing. For a search engine to find a new site, either someone must submit the site's address to the search engine, or its web crawlers must find a link to the new site on a site they already visit. Many search engines now charge a fee to submit a new site and, in exchange, promise to index that site at the earliest opportunity. If no fee is paid, these search engines will eventually find new sites anyway, as they encounter links to them or share lists of sites with other search engines.

Exactly when and how often a specific web crawler visits a site, and how it decides to do so, are kept quite secret by all of the search engines. According to Google's web site, their web crawler attempts to visit every site on their list at least once "every few weeks." If you are familiar with how your web server records requests for individual pages, you can check its logs to see how often Google visits your site.
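If your server keeps an access log, a few lines of Python can do the counting. This sketch rests on some assumptions: that each log line carries a bracketed timestamp like `[23/Jun/2003:10:15:32 -0700]`, that the visitor's browser or crawler name is recorded on the line, and that Google's crawler identifies itself with the word "Googlebot". The log lines below are made up for illustration.

```python
from collections import Counter

def crawler_visits_per_day(log_lines, agent_marker="Googlebot"):
    """Count log lines per day that mention the given crawler name."""
    counts = Counter()
    for line in log_lines:
        if agent_marker not in line:
            continue
        start = line.find("[")
        if start == -1:
            continue  # no timestamp on this line
        end = line.find(":", start)
        if end == -1:
            continue
        day = line[start + 1:end]  # e.g. "23/Jun/2003"
        counts[day] += 1
    return counts

# Hypothetical log lines, not taken from a real server.
sample_log = [
    '192.0.2.10 - - [23/Jun/2003:10:15:32 -0700] "GET /index.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.10 - - [23/Jun/2003:10:15:40 -0700] "GET /faq.html HTTP/1.0" 200 2048 "-" "Googlebot/2.1"',
    '198.51.100.7 - - [23/Jun/2003:11:02:01 -0700] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
    '192.0.2.10 - - [24/Jun/2003:03:30:00 -0700] "GET /index.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1"',
]

for day, hits in crawler_visits_per_day(sample_log).items():
    print(day, hits)
```

To check on a different search engine, pass its crawler's name as `agent_marker`, provided you know what name it uses in your logs.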

Keep in mind there are numerous search engines, and each of them has to visit your site and examine all your pages. You can find a list of search engines in any search engine. :-) Just off the top of my head, I can name at least a dozen, and there are certainly many more:

  • http://search.netscape.com
  • http://www.alltheweb.com
  • http://www.altavista.com
  • http://www.ask.com
  • http://www.dogpile.com
  • http://www.excite.com
  • http://www.freefind.com
  • http://www.go.com
  • http://www.google.com
  • http://www.hotbot.com
  • http://www.lycos.com
  • http://www.northernlight.com
  • http://www.search.com
  • http://www.webcrawler.com
  • http://www.yahoo.com

How Do People Find My Site In A Search Engine?

A search engine lets visitors enter a list of words or phrases (a query), searches through a collection of data for items that match, and displays a list of the items it found (a result), sorted by relevance.

Every search engine may treat a query differently, and frequently will let visitors make a query more specific in some way. For example, entering a query for web pages about "chocolate" and "candy" might show a list of pages where the words "chocolate" OR "candy" appear. A query for pages containing the phrase "chocolate candy" would only show a list of pages where that exact phrase appears.
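The difference between the two kinds of query can be shown in a few lines. This is a toy sketch: the function names and sample pages are my own, and real search engines handle punctuation, word forms, and much else that is ignored here.

```python
def matches_any(text, words):
    """True if the page contains at least one of the words (an OR query)."""
    page_words = set(text.lower().split())
    return any(word.lower() in page_words for word in words)

def matches_phrase(text, phrase):
    """True if the words of the phrase appear together, in order."""
    page_words = text.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    if n == 0:
        return False
    return any(page_words[i:i + n] == phrase_words
               for i in range(len(page_words) - n + 1))

# Hypothetical pages.
pages = {
    "page1": "We sell chocolate candy and fudge",
    "page2": "Chocolate cake and hard candy",
    "page3": "Collie grooming tips",
}

either = [name for name, text in pages.items()
          if matches_any(text, ["chocolate", "candy"])]
exact = [name for name, text in pages.items()
         if matches_phrase(text, "chocolate candy")]
print(either)
print(exact)
```

Here the OR query finds both candy pages, while the phrase query finds only the page where the two words sit side by side.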

Search engines will order the list of results by "relevance" -- which is a loaded term at best. How a search engine decides which pages in the list are the most "relevant" to a specific query is two parts mad science, two parts black magic, one part art, and one part craft. The actual mathematical processes the various search engines use are some of the most closely guarded secrets about the Internet.

Who Decides What Pages Are Shown?

Initially, search engines reported any page that matched the query, and the more often the words in the query showed up, the more "relevant" the page ranked. If the page had the word "chocolate" or "candy" many times, it must be very relevant, right? Very quickly, web sites that wanted to be highly ranked for any search began "salting" their pages with words they thought were important. Frequently, these repeats were hidden in some way, so visitors couldn't see them, but they would be found by the web crawlers used by search engines. Imagine a page with the word "chocolate" one thousand times at the bottom of the page, in the smallest font possible and the same color as the background.
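A sketch of that early counting approach shows why salting worked. The scoring rule here (add up how many times each query word appears) is a guess at the general idea, not any search engine's published method, and both pages are invented.

```python
def count_score(text, query_words):
    """Score a page by how many times the query words appear in it."""
    words = text.lower().split()
    return sum(words.count(q.lower()) for q in query_words)

def rank(pages, query_words):
    """Return matching pages, highest word count first."""
    scores = {url: count_score(text, query_words) for url, text in pages.items()}
    matching = [url for url in pages if scores[url] > 0]
    return sorted(matching, key=lambda url: scores[url], reverse=True)

# Hypothetical pages: one honest, one "salted" with a repeated word.
pages = {
    "honest.example": "handmade chocolate candy made fresh daily",
    "salted.example": "buy now " + "chocolate " * 1000,
    "dogs.example": "collie training tips",
}

print(rank(pages, ["chocolate", "candy"]))
```

The salted page wins easily, even though the honest page is the only one that mentions both words.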

Some sites put complete (but invisible) dictionaries on their pages so their pages would match every search. Before search engines figured out how to filter these pages out, every search resulted in a certain percentage of "adult" and "multi-level marketing" sites.

As time passed, the criteria used to rank a page became more complex and, depending on the search engine, might include how often specific words appear in the page, in the title, or in the keywords assigned to the page. If the search terms occurred "close" to each other on the page, or only in a certain part of the page, the ranking might change as well. Ranking might also take into account which other pages link to a page and how those pages rank for the query, which links appear on the page itself, and a whole host of other factors.

Please keep in mind that none of this can be verified with the search engine companies. If any search engine published the rules it uses to rank pages for relevance, many people would modify their pages (frequently in strange and bizarre, but invisible, ways) in order to show up more often or rank more highly.

Paying For Placement

One way to make sure your site shows up for specific searches is to pay for placement. Many search engines will put a link to your site right up at the top of the page, if you pay them to. Using our example again, if you search for "chocolate" on Google (today, at least) you will see a list of boxes on the right-hand side of the page. Each box belongs to an advertiser who has paid to have their site shown on every results page when "chocolate" is searched for.

