Tuesday, 3 March 2015

How Search Engines Work

To assess how search engines work from the viewpoint of how they rank sites, we need to examine three areas: how they find pages, what they look for on a page, and how they compare different pages against a user's search criteria.

Search engines find pages in one of two ways: through direct submission, or by following links from pages they have already visited. For submissions, they generally have an ADD URL link on the home or help page which allows users to submit either a single web page or their whole site. Some engines ask you to submit just the domain, e.g. http://www.yourdomain.com/, while others allow individual page submissions. Always read the submission guidelines before submitting, as doing it wrong may get a site banned.

Once a page has been submitted, the search engine uses a piece of software called a SPIDER to look at the site. This program extracts different pieces of information, such as meta tag content, the text on each page, text contained in comment tags, image alt tags and form tags. Each engine looks for different information. Spiders also look at the links on each page and may add those links to their database for spidering at a later date. They prefer plain text links to image maps and redirected links such as those used in redirection scripts. Links containing variable identifiers such as '?' will not be followed, as these could lead the spider into infinite loops within the site, or to hundreds of different versions of the same page.
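
To make this concrete, here is a minimal sketch in Python of the kind of extraction a spider performs. The tags it reads and the rule for skipping '?' links follow the description above; everything else (the class name, the sample page) is invented for the example, and a real spider is of course far more involved.

    from html.parser import HTMLParser

    class SpiderParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.meta = {}     # meta tag name -> content
            self.links = []    # followable links found on the page
            self.text = []     # visible text fragments

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "meta" and "name" in attrs:
                self.meta[attrs["name"].lower()] = attrs.get("content", "")
            elif tag == "a" and "href" in attrs:
                href = attrs["href"]
                # Skip links containing variable identifiers such as '?',
                # which could lead the spider into loops or duplicates.
                if "?" not in href:
                    self.links.append(href)
            elif tag == "img" and "alt" in attrs:
                self.text.append(attrs["alt"])  # alt text is readable

        def handle_data(self, data):
            if data.strip():
                self.text.append(data.strip())

    page = ('<html><head>'
            '<meta name="description" content="Cheap car loans compared">'
            '</head><body>'
            '<h1>Car Loans</h1><a href="/rates.html">Rates</a>'
            '<a href="/search?q=loans">Search</a>'
            '<img src="logo.gif" alt="Loans logo"></body></html>')

    parser = SpiderParser()
    parser.feed(page)
    print(parser.meta)   # {'description': 'Cheap car loans compared'}
    print(parser.links)  # ['/rates.html'] (the '?' link was skipped)
    print(" ".join(parser.text))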

The search engine spider examines the code on the page and extracts the text from the programming code. That text is then examined to assess the theme of the page. The engine looks at which words appear regularly throughout the page, and at words appearing in meta tags, link anchor text and emphasised text such as bold or italic words. Together these give the engine an indication of the overall theme of the page, so that a search for 'cars' will bring back pages in which cars feature prominently.
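
A small sketch of theme assessment by weighted word counts. The weights given here to meta tags, bold text and anchor text are assumptions invented purely for illustration; every engine chooses its own.

    import re
    from collections import Counter

    def page_theme(body_text, meta_keywords, bold_words, anchor_words):
        scores = Counter()
        for word in re.findall(r"[a-z']+", body_text.lower()):
            scores[word] += 1              # plain body text: weight 1
        for word in meta_keywords:
            scores[word.lower()] += 3      # assumed meta tag weight
        for word in bold_words:
            scores[word.lower()] += 2      # assumed bold/italic weight
        for word in anchor_words:
            scores[word.lower()] += 2      # assumed anchor text weight
        return scores.most_common(5)

    print(page_theme(
        "New and used cars for sale, family cars a speciality",
        meta_keywords=["cars", "sale"],
        bold_words=["cars"],
        anchor_words=["used", "cars"],
    ))
    # [('cars', 9), ('sale', 4), ('used', 3), ...] so the theme is 'cars'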

After matching the user's search query against the pages in its database, the search engine has to decide which pages are most likely to be of use to the surfer. Each engine has its own ALGORITHM, a mathematical calculation which gives more importance to words appearing, for instance, in meta tags than to words in the body of the page. Each engine is looking for what it believes is the best match for the user. By grading each page according to its algorithm, the engine can decide that page A is a closer match than page B.
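
A toy version of such an algorithm might look like the sketch below. The field weights are pure invention to show the mechanics; the real values are exactly what each engine keeps secret.

    import re

    WEIGHTS = {"title": 4.0, "meta": 3.0, "body": 1.0}  # hypothetical

    def score(page, query):
        total = 0.0
        for field, weight in WEIGHTS.items():
            words = re.findall(r"[a-z]+", page.get(field, "").lower())
            total += weight * words.count(query.lower())
        return total

    page_a = {"title": "Cars for sale", "meta": "cars used cars",
              "body": "We sell cars."}
    page_b = {"title": "Our company", "meta": "about us",
              "body": "We also sell cars."}

    print(score(page_a, "cars"))  # 11.0
    print(score(page_b, "cars"))  # 1.0, so page A ranks above page B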

Engines also look at off-page criteria, such as the number of links pointing to a site and whether those linking pages are themselves relevant to the search. Other factors include the age of the page and whether it is listed in edited directories such as Yahoo and LookSmart.
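
A sketch of how inbound links might be counted, with an assumed bonus when the linking page is itself on-topic. The link data and weights are invented for the example; real engines work from a crawled link graph of billions of pages.

    # (from_page, to_page, from_page_is_on_topic) - invented sample data
    links = [
        ("car-blog.example",  "oursite.example", True),
        ("loans.example",     "oursite.example", False),
        ("car-forum.example", "oursite.example", True),
        ("random.example",    "rival.example",   False),
    ]

    def off_page_score(site):
        score = 0.0
        for _src, dest, relevant in links:
            if dest == site:
                score += 2.0 if relevant else 1.0  # assumed relevance bonus
        return score

    print(off_page_score("oursite.example"))  # 5.0
    print(off_page_score("rival.example"))    # 1.0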

To achieve top-ranking pages it is necessary to reverse engineer the algorithm used by each engine by examining the pages that already rank well for popular searches. By looking at the top 20 sites for a phrase such as 'LOANS', a pattern will emerge which may indicate the different factors that the search engine is looking for: the number of words on the page (word count), how frequently the keyword appears on the page (keyword density) and how near the start of the page the keyword appears (keyword prominence). By examining more searches and more pages the pattern of results becomes clearer.
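
These three measurements are easy to compute, which is what makes this kind of analysis practical. A sketch follows; the prominence formula is one plausible definition, not a standard one.

    import re

    def page_metrics(text, keyword):
        words = re.findall(r"[a-z']+", text.lower())
        keyword = keyword.lower()
        count = words.count(keyword)
        density = count / len(words) if words else 0.0
        # Prominence: 1.0 if the keyword is the very first word, falling
        # towards 0 the later its first occurrence appears on the page.
        if count:
            prominence = 1.0 - words.index(keyword) / len(words)
        else:
            prominence = 0.0
        return {"word_count": len(words),
                "keyword_density": round(density, 3),
                "keyword_prominence": round(prominence, 3)}

    print(page_metrics(
        "Loans for all: compare cheap loans and loan rates today", "loans"))
    # {'word_count': 10, 'keyword_density': 0.2, 'keyword_prominence': 1.0}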

Unfortunately, some sites are able to hide the real code by delivering different pages to search engine spiders than to normal users. They achieve this by examining the IP address and user agent of the visitor before serving an appropriate page. A high-ranking page may also be swapped for a differently coded page as soon as it reaches the top of the search results. So be careful that the page you are looking at is in fact the page that earned the top position; you can often spot a swap because the description shown on the search engine differs from the text on the page itself.
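
One rough way to check a top-ranked page for this trick is to request it twice, once posing as a browser and once as a spider, and compare the responses. A sketch using Python's standard library; the user-agent strings are examples, and a site that cloaks by IP address rather than user agent will pass this test unnoticed.

    import urllib.request

    def fetch_as(url, user_agent):
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.read()

    def looks_cloaked(url):
        as_browser = fetch_as(url, "Mozilla/5.0 (Windows NT 10.0)")
        as_spider = fetch_as(url, "ExampleSpider/1.0")  # made-up bot name
        # Differing bodies suggest cloaking, though legitimately dynamic
        # pages (rotating adverts, timestamps) will also differ.
        return as_browser != as_spider

    # looks_cloaked("http://www.yourdomain.com/")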

Pages which contain little text because they rely on images or Flash animation are unlikely to do well on search engines, as they give the spider little to read and therefore little with which to assess what the page is about. Search engines cannot read text contained within an image or animation. Similarly, they struggle as words become buried more deeply within nested tables. Your web site designers may have created a fabulous-looking site, but is it search engine friendly?

Text is king for the search engines. Anything which gets in the way of descriptive text about your products will affect the position achievable on the engines. A search-engine-friendly site consists of plain text with the targeted phrases repeated throughout the page. Compromise is always necessary in design, but some techniques demolish any chance of a top-ranking site and the sales it would produce.

It is often worth considering creating a text-only version of the site to run alongside the main site, so that search engines have a chance of picking up the site content. This should be designed as if for a text-only browser such as Lynx. Try viewing a page from your site at http://www.delorie.com/web/ses.cgi to see how it looks to a search engine; you may be surprised.
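
If you want a quick local approximation of that text-only view, stripping the tags from a page shows roughly what is left for a spider to read. A crude sketch; the tool above gives a more faithful picture.

    from html.parser import HTMLParser

    class TextOnly(HTMLParser):
        def __init__(self):
            super().__init__()
            self.parts = []
            self.skip = False          # ignore script/style contents

        def handle_starttag(self, tag, attrs):
            if tag in ("script", "style"):
                self.skip = True

        def handle_endtag(self, tag):
            if tag in ("script", "style"):
                self.skip = False

        def handle_data(self, data):
            if not self.skip and data.strip():
                self.parts.append(data.strip())

    parser = TextOnly()
    parser.feed("<html><body><script>var x = 1;</script>"
                "<h1>Car Loans</h1><p>Compare rates.</p></body></html>")
    print(" ".join(parser.parts))  # Car Loans Compare rates.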

