Crawling alone is not enough to build a search engine.

Information identified by the crawlers needs to be organized, sorted and stored so that it can be processed by the search engine algorithms before being made available to the end user.

This process is called Indexing.

Search engines don’t store all the information found on a page in their index but they keep things like: when it was created/updated, the title and description of the

page, type of content, associated keywords, incoming and outgoing links, and a lot of other parameters that are needed by their algorithms.

Google likes to describe its index as the back of a book (a really big book).

Why care about the indexing process?

It’s very simple, if your website is not in their index, it will not appear for any searches.

This also implies that the more pages you have in the search engine indexes, the more your chances of appearing in the search results when someone types a query.

Notice that I mentioned the word ‘appear in the search results’, which means in any position and not necessarily on the top positions or pages.