Making A Mobile Friendly Website: Resources Blocked By robots.txt (5/5)


Fixing “Resources Blocked By robots.txt” Failures

Our earlier posts addressed preparing your website for mobile friendliness. We reviewed the background to mobile friendliness and the Google mobile-friendly test tool, and set out solutions to four of the five problem categories highlighted by the tool.

We now turn our attention to the robots.txt-blocking problem, which is easy to understand, but potentially very hard to solve. 

Resources Blocked By robots.txt

Robots.txt is an optional file that a site owner can place on a website to “help” control what search engines crawl and, in turn, index. The file simply specifies a set of files or folders (directories) that crawlers should not access. We say should because search engines may choose to ignore the robots.txt instructions as they work through a site.
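As a rough illustration of how a well-behaved crawler reads these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content and the paths are made up for this example, and the is_allowed helper is our own name, not part of any library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /scripts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, user_agent="*"):
    """Return True if a crawler that respects robots.txt may fetch the path."""
    return parser.can_fetch(user_agent, path)


print(is_allowed("/about.html"))       # True: not under a disallowed folder
print(is_allowed("/scripts/menu.js"))  # False: /scripts/ is disallowed
```

Note that nothing here enforces the rules; the check only tells a cooperative crawler what the site owner has asked for.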

The advantage of the robots.txt file is that, if respected, a site owner can avoid engines indexing files that only exist to make the site work, as opposed to files that contain content.  This situation is particularly relevant for sites using Content Management Systems or other systems that generate content dynamically.  These sites use tens, hundreds or even thousands of individual script files that contain no content, but assemble content and apply formatting and layout when a page is accessed.

The main disadvantages of robots.txt are that an independent standards authority does not define its function, adherence to it is voluntary, the control options within the file are limited and the file can reveal details of the internal workings of a site to those with malicious intent.

As you might expect, Googlebot respects robots.txt. Annoyingly, the bot also respects robots.txt when running the mobile-friendly test. As a result, sites using default robots.txt values block the bot from accessing display files, including JavaScript and CSS. Fully responsive designs rely on CSS (and often JavaScript, too), so the mobile-friendly test cannot render a page as intended, which is likely to result in a fail.
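To see which display files a set of rules would hide from the bot, a small check along these lines may help. It is only a sketch: the folder names, the list of page resources and the blocked_resources helper are assumptions made for this example, not the defaults of any particular system.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical default-style rules that hide the application folders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
Disallow: /themes/
"""

# Hypothetical resources that a page on the site loads.
PAGE_RESOURCES = [
    "/themes/site/style.css",
    "/includes/js/menu.js",
    "/images/logo.png",
]


def blocked_resources(robots_txt, resources, user_agent="Googlebot"):
    """Return the resources the given user agent is asked not to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [r for r in resources if not parser.can_fetch(user_agent, r)]


print(blocked_resources(ROBOTS_TXT, PAGE_RESOURCES))
# ['/themes/site/style.css', '/includes/js/menu.js']
```

Any CSS or JavaScript file that shows up in this list is one the mobile-friendly test will not be able to load.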

The solution appears to be simple: alter the robots.txt to allow access to these files. And therein lies the issue: “allow” is not a generally agreed valid term for robots.txt, even if Google does accept and act upon it. Common practice is to list only the folders not to be indexed, rather than individual don't-index-me files. An alternative solution might be to list every file that should not be indexed in robots.txt instead. Now you are telling those with malicious intent exactly what’s on your site, as well as creating a huge robots.txt file that will be difficult to maintain. By the way, Google's own robots.txt is currently about 8k and has been used successfully in the past by outside researchers looking for Google's “secret” projects that it has not announced to the public.
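For readers who do want to try the “allow” route, the sketch below shows one way such rules could look and how they behave in Python's standard urllib.robotparser. The folder names are hypothetical. Python's parser applies the first rule that matches, whereas Google documents that it applies the most specific matching rule, so the Allow lines are placed first here to give the same answer under both interpretations. Crawlers that do not understand Allow at all may simply ignore those lines.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: open the display-file subfolders, keep the rest blocked.
# Allow lines come first so first-match parsers agree with longest-match ones.
ROBOTS_TXT = """\
User-agent: *
Allow: /includes/js/
Allow: /themes/site/css/
Disallow: /includes/
Disallow: /themes/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in [
    "/includes/js/menu.js",
    "/themes/site/css/style.css",
    "/includes/db.php",
    "/themes/site/layout.tpl",
]:
    verdict = "allowed" if parser.can_fetch("Googlebot", path) else "blocked"
    print(path, verdict)
```

The trade-off described above still applies: every Allow line is one more hint about the site's internal layout.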

An alternative approach is to relocate the blocked files into one common folder (or folders within a common folder structure) and then simply remove this folder from the robots.txt. Unfortunately, if you use third-party software to run your site, you will find that the JavaScript and CSS files used by that software reside within the third-party software’s folder structure, and the software expects them to be there. Moving these files will require amending the software to let it know the new location. Even if this is possible, it is risky (as is any change to software), may incur time and effort and will probably make future software updating harder.

Despite the complexity of relocation, it is the solution we recommend. It results in a smaller, more manageable robots.txt file than the other solutions, and it should allow you to manage caching more easily because cache-suitable files are located in one place, although the topic of caching is beyond the scope of this post.
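Before relocating anything, it can help to take an inventory of the display files that currently sit inside blocked folders. The following is a minimal sketch of such an inventory, assuming hypothetical folder names (includes, themes) and treating only .css and .js files as display files; the relocation_candidates helper and the demo folder tree are ours, created in a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical folders that robots.txt currently blocks.
BLOCKED_FOLDERS = ["includes", "themes"]
# File types treated as display files in this sketch.
DISPLAY_SUFFIXES = {".css", ".js"}


def relocation_candidates(site_root, blocked_folders=BLOCKED_FOLDERS):
    """List display files (relative to the site root) inside blocked folders."""
    root = Path(site_root)
    found = []
    for folder in blocked_folders:
        base = root / folder
        if not base.is_dir():
            continue
        for path in base.rglob("*"):
            if path.is_file() and path.suffix.lower() in DISPLAY_SUFFIXES:
                found.append(path.relative_to(root).as_posix())
    return sorted(found)


# Demo on a throwaway folder tree; a real site root would be passed instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    for rel in [
        "includes/js/menu.js",
        "includes/db.php",
        "themes/site/style.css",
        "assets/print.css",
    ]:
        target = demo_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    demo_result = relocation_candidates(demo_root)

print(demo_result)  # ['includes/js/menu.js', 'themes/site/style.css']
```

The resulting list is only a starting point: each file on it still needs the software that references it to be pointed at the new location.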

To read more about using robots.txt for higher education websites see our post:  robots.txt - Help HigherEd Website Visitors Find The Good Stuff


