Monday, 5 November 2012

How Search Engines Operate


[Figure: web crawler architecture]



Search engines have two major functions: crawling and building an index, and providing answers by calculating relevancy and serving results.


Crawling and Indexing

Search engines crawl and index the billions of documents, pages, files, news items, videos and media on the World Wide
Web.

Imagine the World Wide Web as a network of stops in a big city subway
system.

Each stop is its own unique document (usually a web page, but sometimes a PDF, JPG or other
file). The search engines need a way to “crawl” the entire city and find all the stops along the way,
so they use the best path available – links.

In short, the two major functions of a search engine are:

1. Crawling and Indexing
Crawling and indexing the billions of documents, pages, files, news items, videos and media on the World Wide Web.

2. Providing Answers
Providing answers to user queries, most frequently through lists of relevant pages, through retrieval and
rankings.

“The link structure of the web serves to bind all of the pages together.”

Through links, search engines’ automated robots, called “crawlers” or “spiders,” can reach the
many billions of interconnected documents.
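
The idea of following links from stop to stop can be sketched as a breadth-first traversal. This is only a toy illustration: the LINKS table and its example URLs are made up, and a real crawler would fetch pages over the network, respect robots.txt and handle many failure cases.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the URLs it links to.
LINKS = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/paper.pdf"],
    "c.example/paper.pdf": [],
    "d.example/orphan": [],  # nothing links here
}

def crawl(seed, links):
    """Visit every page reachable from `seed` by following links."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:  # never queue the same stop twice
                seen.add(target)
                queue.append(target)
    return order

found = crawl("a.example/", LINKS)
```

In this sketch the page "d.example/orphan" is never reached, because no other page links to it. That is the practical reason links matter so much for discovery.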

Once the engines find these pages, they decipher the code from them and store selected pieces
on massive hard drives, to be recalled later when needed for a search query. To accomplish the
monumental task of holding billions of pages that can be accessed in a fraction of a second, the
search engines have constructed data centers all over the world.
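
One common way to store "selected pieces" so they can be recalled quickly is an inverted index, which maps each word to the pages that contain it. The sketch below assumes a tiny in-memory set of pages (PAGES is invented for illustration); it does not describe how any particular engine actually stores its data.

```python
import re
from collections import defaultdict

# Hypothetical page texts, standing in for crawled documents.
PAGES = {
    "page1": "Search engines crawl the web",
    "page2": "Crawlers follow links across the web",
    "page3": "An index makes search fast",
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def lookup(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

INDEX = build_index(PAGES)
```

Answering a query then means a few set lookups rather than rereading every page, which is why an index built ahead of time makes fast responses possible.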

These monstrous storage facilities hold thousands of machines processing large quantities of
information. After all, when a person performs a search at any of the major engines, they demand
results instantaneously – even a 1- or 2-second delay can cause dissatisfaction, so the engines work
hard to provide answers as fast as possible.

Search engines are answer machines. When a person looks for something online, it requires the
search engines to scour their corpus of billions of documents and do two things – first, return only
those results that are relevant or useful to the searcher’s query, and second, rank those results in
order of perceived usefulness. It is both “relevance” and “importance” that the process of SEO
is meant to influence.

To a search engine, relevance means more than simply finding a page with the right words. In the
early days of the web, search engines didn't go much further than this simplistic step, and their
results suffered as a consequence. Thus, through evolution, smart engineers at the engines devised
better ways to find valuable results that searchers would appreciate and enjoy. Today, hundreds of
factors influence relevance, many of which we'll discuss throughout this guide.
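
To show how relevance can go beyond "does the page contain the word", here is a sketch of TF-IDF, a classic textbook weighting that rewards words that are frequent in a page but rare across the collection. It is an assumption chosen for illustration, not the formula of any real engine, and the DOCS collection is invented.

```python
import math
import re
from collections import Counter

# Hypothetical documents for illustration only.
DOCS = {
    "recipes": "apple pie recipe with fresh apple slices",
    "orchard": "visit the apple orchard this autumn",
    "laptops": "laptop reviews and laptop prices",
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, docs):
    """Return ids of matching documents, most relevant first (TF-IDF)."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, words in counts.items():
        total = sum(words.values())
        score = 0.0
        for term in tokenize(query):
            # Number of documents that contain the term at all.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed variant
            score += (words[term] / total) * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)

results = rank("apple pie", DOCS)
```

Both "recipes" and "orchard" contain the word "apple", but "recipes" ranks first because it also matches "pie" and uses "apple" more often relative to its length.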

How Do Search Engines Determine Importance?

Currently, the major engines typically interpret importance as popularity – the more popular a
site, page or document, the more valuable the information contained therein must be. This
assumption has proven fairly successful in practice, as the engines have continued to increase
users’ satisfaction by using metrics that interpret popularity.

Popularity and relevance are not determined manually. Instead, the engines craft careful,
mathematical equations – algorithms – to sort the wheat from the chaff and to then rank the
wheat in order of tastiness (or however it is that farmers determine wheat’s value).
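
As an illustration of "importance as popularity", the sketch below scores pages by the links pointing at them, in the style of the well-known PageRank idea. The GRAPH data, the damping value of 0.85 and the iteration count are assumptions for this example; real ranking systems combine many more signals.

```python
# Hypothetical link graph: page -> pages it links to.
GRAPH = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home"],
    "blog": ["home", "news"],
}

def link_popularity(graph, damping=0.85, iterations=50):
    """Iteratively pass each page's score along its outgoing links."""
    pages = list(graph)
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, targets in graph.items():
            if targets:
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for p in pages:
                    new[p] += damping * score[page] / n
        score = new
    return score

scores = link_popularity(GRAPH)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here "home" comes out on top because every other page links to it, while "blog" comes last because nothing links to it, even though it links out to popular pages.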

Google recommends the following to get better rankings in its
search engine:

1. Make pages primarily for users, not for search engines. Don't
deceive your users or present different content to search engines
than you display to users, which is commonly referred to as
cloaking.

2. Make a site with a clear hierarchy and text links. Every page
should be reachable from at least one static text link.

3. Create a useful, information-rich site, and write pages that
clearly and accurately describe your content. Make sure that
your <title> elements and ALT attributes are descriptive and
accurate.

4. Use keywords to create descriptive, human-friendly URLs.
Provide one version of a URL to reach a document, using 301
redirects or the rel="canonical" link element to address duplicate
content.
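
The "one version of a URL" advice can be sketched as follows. The host name www.example.com, the rule of stripping index.html and the decision to drop query strings are all assumptions made for this toy site; a real site needs rules that match its own URL structure.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical preferred host

def canonical_url(url):
    """Map duplicate URL variants onto one preferred form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    # Assumption: this toy site has no query-dependent pages,
    # so the query string and fragment are dropped.
    return urlunsplit(("https", CANONICAL_HOST, path, "", ""))

def respond(url):
    """Return (301, new location) for a variant, or (200, canonical tag)."""
    target = canonical_url(url)
    if url != target:
        return 301, target
    return 200, f'<link rel="canonical" href="{target}">'

redirect = respond("http://example.com/shoes/index.html?ref=ad")
direct = respond("https://www.example.com/shoes/")
```

Requests for a variant address receive a permanent redirect to the preferred URL, while the preferred URL itself is served with a canonical tag that names it as the single version to index.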
