Back to Basics: Search Engine Optimisations
Melchor Tatlonghari
2023-02-11
Google Search Engine Algorithm: Page Rank
How it works
Google employs a number of bots, known as Googlebots or spiders, to crawl the entire internet and index web pages. The frequency and speed of crawling are determined by the performance and schedule of the bots.
Google’s PageRank algorithm assigns a score to each web page based on the number and quality of external links pointing to it. Each external link to a page is seen as a vote of confidence in that page’s content. Not all votes are counted equally, however. Links from more authoritative and trustworthy websites are given greater weight, while links from spammy or low-quality sites can actually harm a page’s PageRank score. There are several other factors involved, such as advertisements and user preferences, so this is an oversimplification of how pages are ranked.
Each search engine has its own implementation of how to rank and search the web, which explains why different search engines often return vastly different results. It is also worth noting that PageRank is not the only search algorithm Google employs, but it is probably the best known.
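To make the voting idea concrete, here is a minimal sketch of the PageRank iteration on a tiny, made-up link graph. The three pages, the damping factor of 0.85 and the iteration count are illustrative assumptions, not Google’s actual configuration.

```python
# Minimal PageRank sketch on a tiny, made-up link graph.
# The graph, damping factor and iteration count are illustrative assumptions,
# not Google's real configuration.

links = {
    "a.com": ["b.com", "c.com"],  # a.com links out to b.com and c.com
    "b.com": ["c.com"],
    "c.com": ["a.com"],
}

damping = 0.85
pages = list(links)
rank = {page: 1.0 / len(pages) for page in pages}  # start with equal scores

for _ in range(20):  # iterate until the scores roughly stabilise
    new_rank = {}
    for page in pages:
        # Every page inherits a share of the rank of each page linking to it.
        incoming = sum(
            rank[src] / len(outgoing)
            for src, outgoing in links.items()
            if page in outgoing
        )
        new_rank[page] = (1 - damping) / len(pages) + damping * incoming
    rank = new_rank

print(rank)
```

Even in this toy graph the voting effect is visible: c.com, which is linked from both other pages, ends up with the highest score, while b.com, with only one incoming link, scores lowest.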
Suggested Reading: https://www.google.com/intl/en_au/search/howsearchworks/how-search-works/ranking-results/
What problem it was trying to solve
Before PageRank, website owners could simply add meta tags to their pages to try to manipulate search engine rankings. This resulted in poor-quality search results and frustrated users. (This is akin to asking a barber “Should I get a haircut?” The answer will always be in favour of the barber!) PageRank was designed to provide more accurate search results by using a more objective measure of a page’s authority and relevance.
SEO and Web Crawling
You can’t discuss SEO without understanding Web Crawling and Indexing. Crawling refers to the process of a search engine bot visiting a web page and analyzing its content, links, and other elements. Indexing refers to the process of adding a page to a search engine’s database or index. In other words, crawling is the first step in indexing a web page.
Crawling is a procedure whereby bots traverse all the nested URLs within a given website. For instance, when crawling https://medium.com, the bots will go through each individual user’s profile until they end up at, for instance, my profile: https://medium.com/@meltatlonghari3. My profile then contains a link to my personal site, https://melchortatlonghari.com, which in turn links to my social media handles and so on. Imagine this happening recursively for all the sites that exist.
During crawling, the bots register all the URLs they encounter in a queue for the indexer to visit later on. It is important to note that crawling does not index pages as it happens. The task of crawling is complex, as it involves avoiding repeat crawls of previously visited sites. Moreover, crawling the entire internet requires significant processing power.
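As a rough illustration of that queue-and-visited-set idea, here is a toy crawler written with Python’s standard library. The start URL and page limit are arbitrary choices, and a real crawler also respects robots.txt, politeness delays and much more.

```python
# Toy breadth-first crawler: a queue of URLs to visit and a set of URLs
# already seen, so the same page is never fetched twice.
# The start URL and page limit are arbitrary choices for illustration.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, max_pages=10):
    queue = deque([start_url])  # URLs waiting to be visited
    seen = {start_url}          # URLs already queued, to avoid repeat crawls
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue            # skip pages that fail to load
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)  # register the URL for a later visit
    return seen


print(crawl("https://melchortatlonghari.com"))
```

The seen set and the queue are what keep the crawler from revisiting the same URLs, which is exactly the bookkeeping the paragraph above alludes to.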
Indexing and how Google returns millions of results in milliseconds
Have you ever wondered how Google manages to provide you with millions of search results in a matter of seconds? Well, it’s all thanks to their crawlers and indexing process.
Let’s break it down. Firstly, the crawler goes through all the URLs in its queue, visiting each website and collecting information. This requires a lot of processing power, but it’s worth it because this information is then stored in the Google Search Index.
Now, here’s where things get interesting. The bot is able to categorize each website based on its content and structure using HTML tags and Meta tags. This means that when you perform a search on Google, the results have already been indexed beforehand — so the search process is super fast!
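As a rough sketch of what that categorisation step might look like, here is Python’s built-in html.parser pulling the title and meta tags out of a made-up page. A real indexer does far more, but these are the kinds of signals it reads.

```python
# Sketch of how an indexer might pull category signals out of a page's
# HTML and meta tags. The sample HTML is made up for illustration.
from html.parser import HTMLParser

SAMPLE_HTML = """
<html>
  <head>
    <title>Back to Basics: SEO</title>
    <meta name="description" content="How crawling, indexing and PageRank work">
    <meta name="robots" content="index, follow">
  </head>
  <body><h1>Search Engine Optimisation</h1></body>
</html>
"""


class MetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


extractor = MetaExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.title)  # Back to Basics: SEO
print(extractor.meta)   # {'description': '...', 'robots': 'index, follow'}
```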
Google also uses indexing to filter out potentially harmful links and sites from appearing in your search results. So not only are you getting quick results, but you’re also protected from any potential threats online.
All in all, indexing is a crucial part of how Google operates and provides us with such an amazing search experience. So next time you use Google to find something, remember that their indexing process has already worked hard behind the scenes to provide you with those results!
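The key trick is that the expensive work happens at indexing time, not at query time. A toy inverted index, built over made-up pages, shows why lookups can be so fast:

```python
# Toy inverted index: the expensive work (scanning every page) happens once,
# at indexing time, so a query is just a couple of dictionary lookups.
# The pages and their text are made-up examples.
from collections import defaultdict

pages = {
    "https://example.com/seo": "search engine optimisation basics",
    "https://example.com/crawl": "how search engine bots crawl the web",
    "https://example.com/cats": "pictures of cats",
}

# Build the index ahead of time: word -> set of URLs containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)


def search(query):
    # Intersect the URL sets for each query word; no page is scanned at query time.
    results = None
    for word in query.split():
        urls = index.get(word, set())
        results = urls if results is None else results & urls
    return results or set()


print(search("search engine"))  # both search-related pages come back instantly
```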
Suggested Readings: https://www.callrail.com/blog/what-is-crawling-and-indexing and https://developers.google.com/search/docs/crawling-indexing (+1 page rank to callrail.com)
Server Side Rendering versus Client Side Rendering
When it comes to rendering a website with the latest technology in software engineering, it’s important to keep in mind how that choice affects searchability. Depending on how you opt to deliver your site to users, some rendering techniques can leave it effectively unsearchable.
There are two different approaches to rendering web pages: Server Side Rendering (SSR) and Client Side Rendering (CSR). SSR generates HTML on the server and sends it to the client, while CSR generates HTML on the client side using JavaScript.
However, CSR can sometimes be more challenging for search engine bots to crawl and index because the content is generated dynamically after the initial page load. This means that when a bot crawls your site, it may not be able to see all of your content. In contrast, SSR sends a fully rendered page to the client, making it easier for search engines to crawl and index all of your content.
It’s crucial that you consider how you want your website to be indexed by search engines before deciding which rendering approach to take. If you want your website’s content to be easily discoverable by search engines, SSR might be the better option for you.
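A quick, rough way to check which camp a page falls into is to fetch its raw HTML, the way a simple crawler would, and see whether the content you care about is already there. The URL and phrase below are placeholders; substitute your own.

```python
# Rough check of what a non-JavaScript crawler sees: fetch the raw HTML
# and look for a phrase that should appear in the page's content.
# The URL and phrase are placeholders; substitute your own.
from urllib.request import urlopen

URL = "https://example.com"    # placeholder: your page
PHRASE = "Example Domain"      # placeholder: content you expect to be indexed

raw_html = urlopen(URL, timeout=5).read().decode("utf-8", "ignore")

if PHRASE in raw_html:
    print("Content is present in the initial HTML (SSR-style, easy to index).")
else:
    print("Content is missing from the initial HTML; it is probably injected "
          "by JavaScript (CSR-style) and may be harder for bots to see.")
```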
Remember, creating a great website is only half the battle — making sure people can find it is equally important!
Tools of the Trade
Here are some tools you can use to help improve your site’s searchability:
Chrome Plugins: View Rendered Source and User-Agent Switcher
These plugins allow you to “act” as Googlebot when visiting your site, letting you see your pages the way the bot sees them and make adjustments so your site stays searchable. It is worth noting that they also let you see how your site fares under the rendering technique you employ (SSR versus CSR).
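The same idea can be approximated outside the browser by sending a request with a Googlebot User-Agent string. The URL below is a placeholder, and swapping the header obviously does not reproduce Google’s full crawler, but it shows the raw HTML a bot receives before any JavaScript runs.

```python
# Fetch a page while presenting a Googlebot User-Agent, roughly mimicking
# what the plugins above do in the browser. The URL is a placeholder, and
# this only changes the header; it does not reproduce Google's full crawler.
from urllib.request import Request, urlopen

GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

request = Request("https://example.com", headers={"User-Agent": GOOGLEBOT_UA})
html = urlopen(request, timeout=5).read().decode("utf-8", "ignore")

print(html[:500])  # inspect what the "bot" receives before any JavaScript runs
```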
Robots.txt
Robots.txt is a text file that website owners can use to instruct search engine bots which pages to crawl and index, and which pages to ignore. The file is typically placed in the root directory of the website.
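For example, a minimal robots.txt and a quick check with Python’s built-in urllib.robotparser might look like this; the rules shown are purely illustrative.

```python
# Check what a sample robots.txt allows, using Python's standard library.
# The rules below are illustrative; a real file lives at /robots.txt on your domain.
from urllib.robotparser import RobotFileParser

SAMPLE_ROBOTS_TXT = """
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
""".splitlines()

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT)

print(parser.can_fetch("Googlebot", "https://example.com/blog/seo"))     # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/login"))  # False
```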
Screaming Frog
Screaming Frog is a popular SEO tool that allows users to crawl websites and analyze their content, structure, and links. To do a dry run of crawling your site in Screaming Frog, you can set the “Spider Mode” to “List” and input the URLs you want to crawl. This lets you see how the tool will crawl your site without actually running a full crawl.
Google site query
To test how much Google knows of your site, you can use the site: operator in a Google search. Simply type 'site:yourwebsite.com' into the search bar and hit enter. Google will display all of the pages from your site that it has indexed. If some of your pages are missing, it may be an indication that there are crawling or indexing issues that need to be addressed.
In conclusion, SEO is a critical component of any successful online marketing strategy. Understanding the basics of how search engines like Google operate, and implementing best practices like creating high-quality, relevant content and optimizing your site’s load speed and mobile responsiveness, can help improve your search rankings and drive more traffic to your site. However, it’s important to remember that SEO is an ongoing process that requires regular updates and maintenance to stay effective, and it should be integrated with other digital marketing tactics to maximize your overall results. By staying up-to-date with the latest trends and best practices, and committing to a long-term SEO strategy, you can improve your online visibility and grow your business in today’s digital age.
© Melchor Tatlonghari. All rights reserved.