Search engine optimization, commonly known by the acronym SEO, plays a fundamental role in how a website ranks for the keywords relevant to it. To do SEO right, however, it is important to understand what kind of system we are optimizing our content for and which processes underlie it. While it is not known exactly how search engine algorithms work, there are some key principles and concepts that are worth knowing. Let’s see which ones.
Anyone involved in marketing knows the importance of optimizing their website for search engines. We know on-page and off-page SEO, white hat and black hat SEO, and the best practices dictated by Google. The processes that govern how search engines function, however, are not always well understood.
Knowing how and why these processes take place is important if we want to optimize our website in the best possible way. Google and other search engines do not make money from organic results, but without them the paid results (the search engines’ direct revenue) would make no sense: it is organic search that brings a user to a search engine in the first place. This is the reason behind Google’s effort to index and rank the most relevant content for each user query and, consequently, why SEO plays a fundamental role in marketing.
So we must ask ourselves: what, exactly, are we optimizing our site for? How does the system for which we do SEO work?
In this article we will try to answer these questions, delving into some key concepts that underlie the functioning of search engines. The article mainly examines Google, although the principles we will discuss are valid for any search engine.
Crawlers and the crawl budget
Indexing refers to adding the content of a web page to Google’s index. Usually, to make this happen, we don’t have to do anything: Google’s crawlers continually scan the web to find new content that can answer users’ queries in the most relevant way possible.
However, it is possible to facilitate the indexing process by submitting an XML Sitemap, which is a list of the pages of our website, to Google through Search Console. Furthermore, via Search Console we can also explicitly ask Google to index new pages and pages to which we have made important changes.
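To make the idea of an XML Sitemap concrete, here is a minimal sketch in Python that builds one from a list of page URLs. The `urlset`, `url` and `loc` elements and the namespace come from the public sitemap protocol; the function name `build_sitemap` and the example.com URLs are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs, not real pages.
pages = [
    "https://www.example.com/",
    "https://www.example.com/blog/how-search-engines-work",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting file would typically be saved as `sitemap.xml` on the site and its address submitted in Search Console.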
That said, given that there are more than 1 billion websites, many of them with millions of pages each and updated daily, it is natural to wonder how Google manages such a large amount of data. In fact, indexing every available URL is beyond Google’s capabilities, as Google itself admits. For this reason, crawlers do not work blindly but within a crawl budget: the amount of time and resources that Google allocates to crawling a site. The budget assigned to each site is determined by a number of factors, including the speed of the server hosting the site (Google does not want its crawling to degrade the experience of the site’s users) and the importance of the site. A news site like the BBC’s will surely be crawled more often than a restaurant’s site.
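The sketch below is a toy model of the crawl budget idea, not Google’s actual method, which is not public. It assumes a fixed number of page fetches to distribute and two hypothetical inputs per site: an importance score between 0 and 1 and an average server response time. The weighting formula (importance divided by response time) is an invented illustration of the two factors mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    importance: float       # hypothetical score between 0 and 1
    avg_response_ms: float  # hypothetical average server response time


def allocate_crawl_budget(sites, total_fetches):
    """Split a fixed number of page fetches across sites.

    Assumption: faster and more important sites get a larger share.
    The weight importance / response time is purely illustrative.
    """
    weights = {s.name: s.importance / s.avg_response_ms for s in sites}
    total_weight = sum(weights.values())
    return {
        name: int(total_fetches * weight / total_weight)
        for name, weight in weights.items()
    }


sites = [
    Site("news-site", importance=0.9, avg_response_ms=100.0),
    Site("restaurant-site", importance=0.1, avg_response_ms=400.0),
]
budget = allocate_crawl_budget(sites, total_fetches=1000)
print(budget)
```

In this toy model the news site receives by far the larger share of fetches, which mirrors the BBC versus restaurant example.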
Indexing and rendering
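The ranking of a page is often associated only with indexing, but rendering matters too. Rendering is the step in which Google processes a page much as a browser would, executing its scripts to see the content that users actually see.
Indexing followed by rendering helps Google understand even better how to rank a page within the SERP.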
How search engine algorithms work
When we talk about Google’s algorithm, we could make the mistake of thinking of it as a single process full of formulas and data. Actually, it is more correct to speak of Google’s algorithms, using the plural: Google has a myriad of algorithms, each of which fulfils specific functions, collected in a “core algorithm” that ranks the results.
For example, one of Google’s most famous and SEO-relevant algorithms is Panda, which evaluates, penalizes and rewards content based on specific characteristics. But under the aegis of the Panda algorithm there are several other algorithms. Therefore, we have to think of search algorithms as a large collection of other algorithms and formulas, each with its own purpose and task, which all work together to ensure an optimal user experience.
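As a rough illustration of many small algorithms feeding one core ranking, the Python sketch below combines three hypothetical signals into a single score. The signal names, formulas and weights are all assumptions made for this example; Google’s real signals and weights are not public.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    quality: float       # hypothetical content quality score, 0 to 1
    load_seconds: float  # hypothetical page load time
    age_days: float      # hypothetical days since last update


# Each "sub-algorithm" scores one aspect of the page between 0 and 1.
def content_quality(page):
    return page.quality


def page_speed(page):
    return 1.0 / (1.0 + page.load_seconds)


def freshness(page):
    return 1.0 / (1.0 + page.age_days / 30.0)


# Invented weights; the "core" score is just a weighted sum here.
SIGNALS = [(content_quality, 0.6), (page_speed, 0.2), (freshness, 0.2)]


def core_score(page):
    return sum(weight * signal(page) for signal, weight in SIGNALS)


def rank(pages):
    """Order pages from highest to lowest combined score."""
    return sorted(pages, key=core_score, reverse=True)


pages = [
    Page("https://www.example.com/thin-page", 0.4, 0.5, 1.0),
    Page("https://www.example.com/in-depth-guide", 0.9, 1.0, 10.0),
]
ranked = rank(pages)
for page in ranked:
    print(page.url, round(core_score(page), 3))
```

The point of the design is that each signal can be changed or replaced on its own while the combining step stays the same.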
How content ranking works
To understand how content ranking works, it is good to keep one factor in mind: Google, like all other search engines, offers a service to the user. Our content must satisfy the user, who is primarily a Google customer and only secondarily our own potential customer. If our content is not deemed relevant, exhaustive and capable of answering the user’s query, Google has no interest in showing it. It goes without saying, then, that beyond any analysis of (and speculation about) how the processes that govern search engines work, content quality must always come first, as any self-respecting SEO strategy requires.
In an interesting report, Dave Davies of Search Engine Journal, based on his twenty years of experience in the field, identified 5 phases in the ranking process of a page. Here they are.
The first step of the process is to classify the user’s query. Query classification gives the search engine the information it needs to perform all of the following steps.
It is not known which classifications exist or how many there are, but Davies hypothesizes that they may include, for example, Local, Demand, YMYL and so on.
The second step is to assign a context to the query. To do this, when possible, the engine draws on other relevant information it has about the user. This happens, for example, when we type “weather” as a search query: Google already knows which city’s weather we are referring to.
So, at this stage the search engine tries to understand the historical and environmental context of the query. It is therefore likely that the search engine will take into consideration the location, the time, the device we are using, whether we have already made that query, whether the query is related to others, and so on.
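The “weather” example can be sketched in Python as follows. This is only a hypothetical illustration: the `QueryContext` fields mirror the context elements listed above, while the set of implicitly local queries and the rewriting rule are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class QueryContext:
    location: str
    device: str
    timestamp: datetime
    previous_queries: List[str] = field(default_factory=list)


# Assumption: a tiny hand-written set of queries that imply "near me".
IMPLICITLY_LOCAL = {"weather", "restaurants", "pharmacy"}


def contextualize(query: str, ctx: QueryContext) -> str:
    """Attach the user's location to queries that are implicitly local."""
    if query.strip().lower() in IMPLICITLY_LOCAL:
        return f"{query.strip()} in {ctx.location}"
    return query


ctx = QueryContext(
    location="Milan",
    device="mobile",
    timestamp=datetime(2021, 3, 1, 8, 30),
)
print(contextualize("weather", ctx))
print(contextualize("history of pandemics", ctx))
```

A real engine would presumably also weigh the device, the time and the search history; only location is used here to keep the sketch short.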
The third step deals with ambiguity. Certain search queries lend themselves to at least two interpretations. Let’s take the query “pandemic”. Those searching for this term may want general information on pandemics in history, or may simply want to know what a pandemic is. At the same time, as we are in the midst of the COVID-19 pandemic, the intent behind the query could be to read the latest news about it.
The same goes for several other search queries, which can take on a current connotation depending on the events.
The task of machine learning algorithms such as RankBrain is precisely to evaluate which results may be most interesting for the user, based on the query and context elements.
The fourth step concerns the layout of the SERP, which changes depending on the query. There may be featured snippets, an in-depth box on the right, a box dedicated to related queries, a box dedicated to related videos and so on. In short, Google goes to great lengths to provide us with as many relevant results as possible. But how does it determine what to show us each time?
Davies tried to answer this question by assuming that, when a query is run and the first three steps are completed, the search engine refers to a database of the various elements that could be placed on the page and their possible positions, and then determines which ones apply to the specific query.
The fifth step is actually the simplest. Based on the information collected in the previous steps, search engines analyse the various sites that can be considered for ranking, putting algorithms to work to determine the order in which the sites should appear in the SERP. To do this they will have to evaluate each element of the page.
The role of User Behaviour
User behaviour is also taken into consideration by Google to understand which results are of greatest interest for a given query. In fact, Google is able to know:
- which sites in the SERP are clicked
- how much time the user spent on the clicked site before returning to the SERP
- what the user does after returning to the SERP
This leads to four possible scenarios.
If a user clicks on a site but exits it almost immediately, this will be a signal to Google that the proposed result is not relevant to the query. If, on the other hand, a user clicks on a site, stays there for some time, then returns to the SERP and clicks on another site, Google will receive a positive signal of relevance from this behaviour as it is likely that the user has found the first result interesting and is simply looking for more information.
If after consulting a result the user returns to the SERP and modifies the search query, this will signal to Google that the user was probably not precise enough in their search and therefore this will have no impact on the relevance of the visited website.
If, on the other hand, the user clicks on a site and then returns to the search engine to make a completely different query, then this translates into a positive sign of relevance, as it is likely that the user has found what they were looking for and is now searching for something else.
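The four scenarios can be summarised as a small decision function. The Python sketch below is an interpretation of the scenarios described above, not a description of Google’s implementation. The 10-second threshold for “exits almost immediately” and the choice to check the visit length before the next action are both assumptions.

```python
from enum import Enum


class NextAction(Enum):
    CLICKED_ANOTHER_RESULT = "clicked_another_result"
    REFINED_QUERY = "refined_query"
    NEW_QUERY = "new_query"


# Assumption: visits shorter than this count as "exits almost immediately".
SHORT_VISIT_SECONDS = 10.0


def relevance_signal(dwell_seconds: float, next_action: NextAction) -> str:
    """Map one click on a result to a 'positive', 'negative' or 'neutral' signal."""
    # Scenario 1: the user leaves almost immediately.
    if dwell_seconds < SHORT_VISIT_SECONDS:
        return "negative"
    # Scenario 3: the user rewrites the query, so the original query was
    # probably imprecise and the visited site is not judged.
    if next_action is NextAction.REFINED_QUERY:
        return "neutral"
    # Scenarios 2 and 4: the user stayed, then either looked for more
    # information on another result or moved on to an unrelated query.
    return "positive"


print(relevance_signal(3.0, NextAction.CLICKED_ANOTHER_RESULT))
print(relevance_signal(120.0, NextAction.CLICKED_ANOTHER_RESULT))
print(relevance_signal(120.0, NextAction.REFINED_QUERY))
print(relevance_signal(120.0, NextAction.NEW_QUERY))
```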
Nobody knows exactly which rules govern search engines and their algorithms. What Google gives us is a list of good practices and useful tips for providing users with content relevant to their queries.
However, observing, as users, how Google behaves can help us draw important conclusions about the most successful content and the logic behind its ranking.
The Google SERP itself, which changes appearance based on each search by each user, hides many opportunities: by observing it we can understand what Google considers a topic related to a specific search; what videos, tweets or other media it shows for that query; and what other queries users are making in relation to a particular keyword.
What we must never forget is that the main goal of a search engine is to help users complete a task. Therefore, all of our content creation and SEO efforts must be geared towards the same goal if we really want to be successful.
Did you find this article useful? Has observing and analysing the SERP ever helped you in creating SEO content? Tell us your experience in the comments and contact us to learn about our digital marketing services.