Googlebot is, in simpler terms, Google's web crawler. Google uses it to read and analyze every piece of content that people put up on the internet. The crawler comes in two types: one for smartphones and one for desktops. Based on Googlebot's findings, Google decides whether a site should be indexed or whether it needs more work and polishing.

Every site on Google is analyzed by both of these crawlers. However, both identify themselves with the same user agent token ("Googlebot"), so it is not possible to target just one of them.
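For reference, these are the two user agents roughly as Google documents them (the Chrome version in the smartphone string rotates over time, so treat the W.X.Y.Z part as a placeholder):

    Googlebot Desktop:
    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

    Googlebot Smartphone:
    Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36
    (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36
    (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Since both strings carry the same "Googlebot" token, a robots.txt rule aimed at that token applies to both crawlers at once.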

Seeing the First 15MB of HTML Content

In June 2022, Google made a notable change to its Googlebot help document. It brought to people's notice that the bot does not crawl content beyond the first 15MB of an HTML file, and only what falls within that limit is considered for indexing. Although people are only learning about it now, Google stated that it has been following this rule for years. It simply never had a visible impact on sites, because almost all sites are far smaller than that; 15MB of HTML text is a lot more than people actually think.

If so, then why has Google added this to its documentation now?

Even though Googlebot was already operating this way, Google simply hadn't added it to its documentation. It has done so now so that people start paying more attention to their content. Instead of producing lengthy, irrelevant pages, this should push people to come to the point in the most appropriate way, without stuffing their articles unnecessarily.

Its Impact On SEO

Ever since the word got out, people have been worked up about it. People working on Search Engine Optimization, especially, have a lot of questions. Continue reading to find answers.

Before asking whether this new addition has an impact on SEO, it is necessary to know that the 15MB Google mentions covers the HTML text only. Anything other than that, be it images, videos, JavaScript, or CSS, is fetched and read by the bot separately.

According to Google, most of the sites on the internet fall well within the 15MB HTML size. For pages that approach the limit, though, the advice amounts to keeping what matters early in the file: any inline CSS or inline scripts count toward the budget, so the content you want crawled must sit within the first 15MB. It is also important that the content in that first 15MB is SEO-friendly and meets the norms.
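As a rough sketch of that advice (the file name and class name here are invented for illustration), the point is that anything inlined in the HTML counts toward the 15MB budget, so indexable content should come early and heavy assets should stay in external files, which Googlebot fetches separately:

    <!DOCTYPE html>
    <html>
    <head>
      <title>Example page</title>
      <!-- Inline only the critical CSS; inlined bytes count toward the 15MB -->
      <style>.hero { font-size: 2rem; }</style>
    </head>
    <body>
      <!-- Content you want indexed goes first, well inside the limit -->
      <h1 class="hero">Main heading with the key copy</h1>
      <p>The text that must be crawled and indexed...</p>
      <!-- Large scripts and data belong in external files, not inlined -->
      <script src="/static/app.js"></script>
    </body>
    </html>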

Check Your Site Size

In case you are not sure about the size of your site, there are various tools on the market that you can use to analyze it:
    • Screaming Frog SEO Spider
    • Google PageSpeed Insights

Both of these tools can give you a clear idea of the size of a page, and Screaming Frog can also crawl through an entire site for you.

The ideal size of a page, according to SEO best practices, is 100KB or less. However, it can go up to 150-200KB in the case of an eCommerce page.
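If you would rather spot-check a single page yourself, a short script can report the raw HTML size. This is a minimal sketch in Python using the third-party requests library; the URL is a placeholder for your own page:

    import requests  # third-party: pip install requests

    URL = "https://example.com/"   # placeholder: put your page here
    IDEAL_KB = 100                 # SEO rule-of-thumb page size
    LIMIT_MB = 15                  # Googlebot's HTML crawl cutoff

    # .content is the raw HTML bytes only; images, CSS, and JS are
    # separate fetches and do not count toward the 15MB limit.
    html = requests.get(URL, timeout=30).content
    size_kb = len(html) / 1024

    print(f"HTML size: {size_kb:.1f} KB")
    if size_kb > IDEAL_KB:
        print("Above the ~100KB ideal; consider trimming.")
    if size_kb > LIMIT_MB * 1024:
        print("Beyond 15MB: Googlebot will ignore the tail of this page.")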

Its Impact On Your Site Performance

If you have a page whose HTML exceeds 15MB, then you definitely need to work on it: make some edits, trim some text, and bring it down toward the ideal page size. Any page larger than that runs straight into the 15MB crawling limit in Googlebot's documentation, and a page stuffed with unnecessary, non-informative filler risks having everything beyond the cutoff ignored by Google's index.

However, making a page that is too small is also of no use, as it will lack complete content. In fact, removing content just to meet the ideal size without thinking it through can decrease your engagement. It is all about striking the right balance rather than going too big or too small.

How Googlebot Crawls Your Website

Now that you have the 15MB limit clear in your mind, let us take you through some of the basics if you are new to this. First, here is how Googlebot accesses your website and crawls its HTML. Googlebot tries to crawl and read as many pages on a site as possible without overwhelming the site's bandwidth; it typically accesses a site no more than once every few seconds on average, and you can put in a request to change the crawl rate for your site. Remember that Googlebot does not fetch your text, images, videos, JavaScript, and CSS together: the bot fetches and analyzes everything other than the HTML separately, which is why the 15MB limit applies to the HTML text only.
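To make that cutoff concrete, here is a small illustration of the rule (not Google's actual implementation) that checks whether a given piece of text lands within the first 15MB of a saved HTML file; the file name and marker text are placeholders:

    # Does this key content fall within the first 15MB of the HTML?
    # "page.html" and the marker below are placeholders for your own values.
    LIMIT = 15 * 1024 * 1024  # 15MB, counted in bytes of HTML

    with open("page.html", "rb") as f:
        crawlable = f.read(LIMIT)  # the part Googlebot considers
        remainder = f.read()       # the part beyond the cutoff

    marker = b"unique product description"
    if marker in crawlable:
        print("Marker sits inside the first 15MB and can be indexed.")
    elif marker in remainder:
        print("Marker falls beyond the cutoff and will be ignored.")
    else:
        print("Marker not found anywhere in the file.")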

How To Block Googlebot From Crawling Your Site

A developer understands how important it is to keep updating their site, and during those updates the site goes under construction. This is the worst time to let Googlebot visit, which is why it is necessary to block the bot at times like these. But does Google allow it? Well, yes, it does. In fact, it offers three options for users to choose from as per their needs: restricting the bot from crawling a page, from indexing a page, or from accessing the page at all. This also helps prevent the bot from reading any incorrect or broken links present on your site.
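As a minimal sketch of the first two options, a robots.txt rule stops Googlebot from crawling a path, and a robots meta tag keeps a crawlable page out of the index (the /under-construction/ path is a placeholder):

    # robots.txt at the site root: block crawling of a section
    User-agent: Googlebot
    Disallow: /under-construction/

    <!-- In a page's <head>: allow crawling, but block indexing -->
    <meta name="robots" content="noindex">

Note that Googlebot has to be able to crawl a page in order to see its noindex tag, so avoid combining a Disallow rule with noindex on the same URL. The third option, blocking access altogether, is usually handled with password protection or by serving an error status such as 403.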

Summing It Up!

With the new addition to the Googlebot documentation, there is no harm to your site; in fact, there never has been, as Google declared that it has been crawling only the first 15MB of HTML for years now. However, it is important for all developers to avoid creating unnecessarily lengthy pages and to present their information in a short and crisp way. If you are looking for someone who can do that for you, get in touch with UNIbizTec. Our digital marketers and developers will help you create a site that can take your business to newer and greater heights.