14. Spidering, Indexing and Meta Tags
// June 27th, 2009 // free seo book
Spidering
The act of crawling the web by a search engine robot is called spidering. The spider’s job is to visit a web page, read its content, follow its links and pass the information it collects to the search engine’s index. The web crawler travels from one page to another by following hyperlinks. During this process, the spider follows several parallel paths at the same time.
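To make the idea of following hyperlinks concrete, here is a minimal sketch of how a spider might collect the links on a page. It uses only Python's standard library. The class name `LinkExtractor`, the helper `extract_links` and the sample page are assumptions made for illustration, not the method of any particular search engine.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every hyperlink target found in the given HTML."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
page = '<p><a href="/about.html">About</a> <a href="http://example.org/">Partner</a></p>'
links = extract_links("http://example.com/index.html", page)
print(links)
```

Each URL in `links` would become a new page for the spider to visit, which is how it moves from one page to another.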
Web crawlers periodically return to the pages they have indexed, but the interval between visits may vary from one month to several months. Any changes made to a page during this period therefore appear in the index only after the spider’s next visit.
Search engine robots automatically crawl the WWW and build up their listings. It is very useful to know what factors affect the so-called “deep crawl”. Deep crawl refers to how many levels of links the robot will follow from the web page it visited first.
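Crawl depth can be illustrated with a small breadth-first crawl over an in-memory link graph. This is only a sketch: the `WEB` dictionary stands in for a real site, and the `max_depth` parameter is an assumed name for the limit on how many link levels the robot follows.

```python
from collections import deque

# Hypothetical site structure: each page maps to the pages it links to.
WEB = {
    "home": ["products", "about"],
    "products": ["widget", "gadget"],
    "about": ["home"],
    "widget": ["specs"],
    "gadget": [],
    "specs": [],
}


def crawl(start, max_depth):
    """Breadth-first crawl that stops following links beyond max_depth."""
    visited = {start: 0}  # page -> depth at which it was first found
    queue = deque([start])
    while queue:
        page = queue.popleft()
        depth = visited[page]
        if depth == max_depth:
            continue  # do not follow links any deeper
        for link in WEB.get(page, []):
            if link not in visited:
                visited[link] = depth + 1
                queue.append(link)
    return visited


shallow = crawl("home", max_depth=1)
deep = crawl("home", max_depth=3)
print(sorted(shallow))
print(sorted(deep))
```

With a depth of 1 the robot only reaches the pages linked directly from the start page. A deeper crawl also reaches pages such as `specs`, which sit three links away.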
Submitting a website to a search engine can significantly increase your chances of being listed in its index. While crawling the Internet, the robot stores the pages it visits in its memory, but the key step is the indexing itself. You can think of the search engine’s index as a huge database that contains all the information the web crawlers have collected. The spider doesn’t index the whole page. The algorithm responsible for searching and page ranking is applied only to the index that has been created.
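One common way to picture such an index is as an inverted index: a table that maps each word to the pages containing it. The sketch below builds one from a few made-up pages. The `PAGES` data and the `build_index` function are assumptions for illustration, and real search engine indexes store far more than word counts.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> visible text.
PAGES = {
    "example.com/a": "Fresh coffee beans roasted daily",
    "example.com/b": "Coffee grinders and coffee filters",
    "example.com/c": "Tea leaves and tea pots",
}


def build_index(pages):
    """Map each word to the pages containing it, with occurrence counts."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word][url] = index[word].get(url, 0) + 1
    return index


index = build_index(PAGES)
print(index["coffee"])
```

Once the index exists, a search never has to read the original pages again. It only looks up words in this table.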
Indexing
Most search engines assert that their spiders index the whole visible content of a page. In the next chapters we’ll share the key factors you need to keep in mind to ensure that the indexing of your web pages improves their relevance in search. It’s important for every webmaster to understand the indexing and page-ranking process, because this knowledge will help them develop the right strategies.
Meta Tags
The META description and META keywords tags play an important role, because they are indexed differently from the rest of the page. At present, some of the most popular search engines don’t always index the META keywords, as they consider some of them spam. Keep in mind that some of the so-called “stop words” (“a”, “the”, “of”) are not indexed either, so you’d better skip them when entering your META keywords. Spiders don’t index images, but they do index their descriptions, which are called Alt texts.
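As a rough illustration, the sketch below reads the META description, the META keywords and the image Alt texts from a page, then drops stop words from the keywords. The `MetaExtractor` class, the `strip_stop_words` helper and the sample HTML are hypothetical. The stop-word list contains only the three examples named above, and real lists are longer.

```python
from html.parser import HTMLParser

# Only the stop words named in the text; real lists are longer.
STOP_WORDS = {"a", "the", "of"}


class MetaExtractor(HTMLParser):
    """Collects the META description, META keywords and image Alt texts."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.keywords = []
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.description = content
            elif name == "keywords":
                # Keywords are conventionally separated by commas.
                self.keywords = [k.strip() for k in content.split(",") if k.strip()]
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])


def strip_stop_words(phrase):
    """Remove stop words from a keyword phrase."""
    return " ".join(w for w in phrase.split() if w.lower() not in STOP_WORDS)


# Hypothetical page, used only for illustration.
html = """
<html><head>
<meta name="description" content="Hand-roasted coffee beans shipped weekly.">
<meta name="keywords" content="coffee beans, the art of roasting, espresso">
</head><body>
<img src="beans.jpg" alt="Roasted coffee beans in a bowl">
</body></html>
"""

parser = MetaExtractor()
parser.feed(html)
cleaned = [strip_stop_words(k) for k in parser.keywords]
print(parser.description)
print(cleaned)
print(parser.alt_texts)
```

The image file itself is never read here. Only its Alt text is collected, which mirrors how a spider treats images.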
When somebody searches for a keyword or phrase, the search engine looks through its index for relevant information. The search software then returns a results page to the searcher, listing the most relevant web pages in descending order of relevance. The algorithms behind these processes will be discussed in more detail in the next chapters of this free SEO book.
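The lookup step can be sketched in a few lines. The example below scores each page by how often the query words occur on it and returns the pages in descending order of score. Counting word occurrences is only a stand-in chosen for illustration, since real ranking algorithms weigh many more factors. The `INDEX` data and the `search` function are likewise hypothetical.

```python
# Hypothetical index: word -> {page: number of occurrences}.
INDEX = {
    "coffee": {"example.com/a": 1, "example.com/b": 3},
    "beans": {"example.com/a": 1},
    "tea": {"example.com/c": 2},
}


def search(query, index):
    """Return (page, score) pairs, highest score first."""
    scores = {}
    for word in query.lower().split():
        for page, count in index.get(word, {}).items():
            scores[page] = scores.get(page, 0) + count
    # Sort by score descending; break ties by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


results = search("coffee beans", INDEX)
for page, score in results:
    print(page, score)
```

A query with no matching words simply returns an empty list, because nothing in the index refers to it.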
Web directories compile lists of websites into subject categories, with a short description for each website. For your website to be included in a particular directory, you first need to submit it to that directory. You can do this using the “Add URL” option. People often visit directories in order to locate relevant sites and information sources. Directories therefore support structured search.
Crawler-based search engines very often find new sites for indexing by crawling the most popular directories on the Internet first. Yahoo! and DMOZ (ODP) are the largest and most reputable web directories on the WWW. Lycos, for example, is one of the first search engines that transformed into a directory and depends on AlltheWeb.com for its listings.




