Google Explains Robots.txt Best Practices

Google’s Developer Advocate Martin Splitt explains robots.txt best practices for SEO, covering when to use noindex tags versus robots.txt disallow rules and flagging one common mistake to avoid.

Noindex tag in robots meta tags vs. disallow rule in robots.txt files

SEJ writer Matt G. Southern caught Martin Splitt’s Search Central Lightning talk on YouTube, where he explains the difference between the “disallow” rule in robots.txt files and the “noindex” tag in robots meta tags.

Splitt confirms that both play essential roles in managing how search engine crawlers interact with websites, but says they serve different purposes and should not be used in place of one another.


Noindex tag in robots meta tags recap

“Noindex” is a robots meta tag directive that instructs search crawlers to omit a web page from search results. The page won’t appear in search, but crawlers can still read it and visitors can still reach it through internal links from other pages on your site.

To implement it, you add the “noindex” rule either as a robots meta tag in the page’s HTML head section or as an X-Robots-Tag HTTP response header, which is useful for non-HTML files.
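As a minimal sketch (purely illustrative, not taken from Splitt’s talk), the meta tag version goes inside the page’s head element:

<meta name="robots" content="noindex">

and the header version is sent with the HTTP response, for example for a PDF:

X-Robots-Tag: noindex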

When to use it:

Martin advises using “noindex” when you want search engines to be able to read a web page’s content but keep it out of search results. Splitt says the page remains available to users who reach it directly or through links; it simply won’t be displayed in search.

Disallow rule in robots.txt files recap

Adding a “disallow” rule to your site’s robots.txt file blocks search engines from accessing specific files, URLs, or sensitive areas of your website, so they cannot crawl or read that content.

When to use it:

Splitt advises using “disallow” when you want to block search engines from crawling a page altogether. He says it helps protect sensitive areas of your site and keeps irrelevant content, such as login or thank-you pages, out of search engines’ reach, which can also help preserve your site’s crawl budget.
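As a minimal sketch, a robots.txt file blocking those kinds of pages could look like the following (the paths are hypothetical examples, so adjust them to your own site’s URL structure):

User-agent: *
Disallow: /login/
Disallow: /thank-you/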

Splitt says to avoid this mistake

Splitt recommends against using a “disallow” rule in robots.txt and a “noindex” directive in the meta tag for the same page. He calls this a common mistake: because the disallow rule stops crawlers from fetching the page, they never see the noindex tag, so search engines may still index the URL, just with limited information.

Instead, if your goal is to keep a page out of search results, Martin advises using only the “noindex” directive and leaving the page crawlable rather than disallowing it in robots.txt.
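In practice, that setup looks something like this (using a hypothetical /old-promo/ page): leave robots.txt without a Disallow rule for that URL, and add the noindex tag to the page itself:

<meta name="robots" content="noindex">

Because the page stays crawlable, search engines can fetch it, see the noindex directive, and drop it from their results. If robots.txt contained Disallow: /old-promo/ instead, crawlers would never fetch the page and would never see the tag.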

You can review your website’s disallow and noindex directives using Google’s Search Console to ensure search engines interpret them correctly.

The takeaway 

Effectively using the noindex and disallow directives enables you to control how search engines interact with your site, making understanding them essential to your SEO efforts. 

If noindex and disallow directives are misused, the long-term effects can damage your site’s SEO and security. 

By applying them correctly, you can manage link equity, protect sensitive information, and improve your website’s visibility in search results.

Terry O'Toole

Terry is a seasoned content marketing specialist with over six years of experience writing content that helps small businesses navigate the point where business and marketing meet: SEO, social media marketing, and more. Terry has a proven track record of creating top-performing content in search results. When he is not writing content, Terry can be found on his boat in Italy or chilling in his villa in Spain.
