How to configure webpages so Search Engine Optimisation (SEO) can work
Why does configuring webpages so SEO can work matter to online course and programme providers? Because, before you can optimise your site for search engine performance, you need the foundation in place for SEO to work effectively, and you need to be sure SEO is not working against other website settings.
As we continue to build our datasets about online education providers and their websites, we notice a surprising number of ‘basic’ website set-up issues and configuration settings that no longer follow best practice. These blog posts are designed to explain the issues and how they can be resolved.
We will cover mobile configuration next week, social media set-up the week after and conclude with a discussion of website analytics configuration.
In this post, we will address basic ‘pre-SEO’ configuration that should operate site-wide as well as some settings for specific pages. Again, after reading this post you should be aware of the key set-up decisions and their influencing factors. You will then be positioned for a more informed discussion with your team about how your online education or e-learning site should be configured.
Let's split the process into two:
- Influencing factors – policy or ‘philosophy’ decisions you need to make about set-up along with some best practices;
- On-page configuration – more best practices, but tuned to specific pages on your website.
Let your visitors enjoy their visit. Ensure site visitors enjoy their visit by making pages load quickly. Compression and caching are two system-wide settings that can be used to minimise page download and display (“rendering”) time and so provide a better user experience. Moreover, search engines think fast is good as well: when they index a site, they will assess whether compression is enabled and adjust rankings accordingly. The essential tool for enabling server-based compression is GZIP. We find a surprising number of online education websites do not have GZIP activated: turn on GZIP to keep users and search engines happy. You can verify your current settings by using CheckGZIPCompression.
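As an illustration, on an Apache server (one common set-up; nginx and other servers have equivalent directives) GZIP compression of text-based assets can be enabled with a few lines in the server configuration or .htaccess file:

```apacheconf
# Compress common text-based assets with mod_deflate (Apache)
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/css text/plain
  AddOutputFilterByType DEFLATE application/javascript application/json
</IfModule>
```

Note that GZIP helps most with text assets (HTML, CSS, JavaScript); images are better served by the format-specific compression discussed later in this post.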
Help the search engine crawler get the best view of your site. An XML sitemap assists Google and Bing in understanding the structure and size of your website so it can be indexed efficiently. You can generate an XML sitemap using code supplied by Google. You can check whether you already have an XML sitemap in place, here: Webmaster World Tools. Equally important, but far less commonly seen, is an HTML sitemap for human visitors: a page on your site that shows the overall structure. While a search leads to the pages you want visitors to land on, a map provides an overview of how the site is structured. You can try the following tools to produce HTML sitemaps with content management systems: Drupal, WordPress, TYPO3 and Joomla!.
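A minimal XML sitemap looks like the following (the URL and date are placeholders, of course):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/courses/intro-to-statistics</loc>
    <lastmod>2016-05-01</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <!-- one <url> entry per page you want crawled -->
</urlset>
```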
Further assistance for search engines. As well as having an XML sitemap, you should have an appropriately set-up robots.txt file. Again, configuration should be left to the technical specialist, but understanding the operational implications will help in ensuring better SEO performance. The robots.txt file tells Google, Bing and other search engine ‘indexers’ how to handle the indexing of your site. You can set entries in the file to ask search engines to ignore certain parts of your website, but this can leave tantalising holes in the search engines' results.
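By way of illustration, a simple robots.txt might look like this (the paths shown are placeholders):

```
User-agent: *
Disallow: /admin/
Disallow: /search-results/

Sitemap: https://www.example.com/sitemap.xml
```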
It is better practice to allow the search engines to crawl but to use HTML meta “noindex” elements (or tags, as they are often called – albeit, strictly, incorrectly) to stop specific pages being listed in search results. We conducted a small survey of 224 online course provider websites in preparation for this blog and found that only 20% had explicit instructions on the homepage for robots. You can use the Google or Bing Webmaster tools to test your robots.txt file.
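The ‘noindex’ instruction sits in the head of the specific page you want kept out of search results:

```html
<head>
  <!-- ask search engines to follow links on this page but not list it -->
  <meta name="robots" content="noindex, follow">
</head>
```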
More help for visitors and search engines. Search engines and human visitors looking at search engine results appreciate easy-to-read URLs. By default, many content management systems display arcane-looking URLs based on database entries and the plugins being used. These URLs are hard to read and understand and meet with a poor user response. Turn on the plugin, module or core system setting that enables ‘search engine friendly’ URLs. However, exercise some care in thinking about the structure of the site and the naming conventions used. A well-conceived structure will result in concise, readable URLs, and click-throughs from search results will be better. Search engines assess the structure of URLs in on-page links, so avoid the underscore (“_”) and mixed capitalisation; use the dash (“-”) as a word separator instead and receive higher ranking as a reward.
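As a sketch, here is how an Apache mod_rewrite rule (one common mechanism; most content management systems handle this for you once the setting is enabled) can map a readable URL onto the underlying database-driven one. The path and parameter names are illustrative only:

```apacheconf
RewriteEngine On
# Serve the readable URL /courses/intro-to-statistics from
# the CMS's internal query-string URL
RewriteRule ^courses/([a-z0-9-]+)/?$ index.php?page=course&slug=$1 [L,QSA]
```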
Small, smaller, smallest. We discussed compression and caching earlier, so this is a good point at which to discuss another technique for improving a visitor’s experience when arriving at your site. As pages load they bring with them the associated code that controls the presentation and on-screen interaction. The larger and more numerous these files, the slower the page loads. One method of resolving this problem is ‘minification’ – essentially removing and compacting everything that isn’t actual program code. This can reduce the amount of data being loaded by 50% or more. This is a well-established technique and tools are available, such as Yahoo’s YUI Compressor, Google’s Closure Compiler and UglifyJS (warning: very technical). As well as ‘squeezing out all the air’, multiple files can be consolidated to reduce the number of file requests; site visitors benefit from faster loading and site operators benefit from better search engine ‘scores’.
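To make the idea concrete, here is a toy CSS minifier in Python – a deliberate simplification for illustration only; real minifiers such as those named above handle many more cases:

```python
import re


def minify_css(css: str) -> str:
    """Toy CSS minifier: strips comments and collapses whitespace.

    Illustrative only -- production tools also shorten values,
    merge rules and handle edge cases this sketch ignores.
    """
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # remove /* comments */
    css = re.sub(r"\s+", " ", css)                   # collapse runs of whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)     # trim around punctuation
    return css.strip()


print(minify_css("body {\n  color: #333;  /* text colour */\n  margin: 0;\n}"))
# → body{color:#333;margin:0;}
```

The same principle – drop everything the browser does not need – applies to JavaScript, where the savings are usually larger.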
The big squeeze. After switching on compression and minifying code we can conclude our work to load pages faster by compressing in-page images. Fetching large numbers of uncompressed images slows down page loading. But, before we apply compression, we need to ensure that we set up an image’s properties correctly for its intended use:
- Choose the image dimensions according to its use – there’s no need to use a 400x400 image if a 100x100 is all that is needed;
- Set the appropriate resolution (dots/pixels per inch). The source image may be 300 dots per inch, but this can be downgraded to 72 dpi on your website;
- Decide on the image format. Animated images (e.g. ‘spinners’ showing processing is taking place) should use the GIF format. Images that need a transparent background should use the PNG format and just about everything else can use the JPG format.
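The steps above can be sketched in Python using the Pillow imaging library (an assumption – your own tool chain may differ; the sizes and quality setting are illustrative):

```python
from io import BytesIO

from PIL import Image  # Pillow: a widely used Python imaging library


def prepare_web_image(img, size=(100, 100), quality=65):
    """Resize an image and save it as a compressed JPEG.

    A `quality` of roughly 60-65 matches the compression level
    discussed in the text; experiment per image.
    """
    resized = img.convert("RGB").resize(size)
    buf = BytesIO()
    resized.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()
```

For a transparent logo you would save as PNG instead; for an animated spinner, GIF.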
When you’ve adjusted your image to the right size, resolution and format, apply compression. You will have to experiment with the amount of compression: we achieve acceptable final image quality at 60%-65% compression, but you don’t want to degrade the quality of the resulting image, hence the need for experimentation. One service we like for compression – and it’s free – is JPEG-OPTIMIZER.
For some more background on this topic, read the linked article on the Google Developers website. Also, don't forget your site’s “favicon”, which may need to be available in multiple formats to respond to different devices – we’ll address this in our future post on mobile readiness.
The art of re-direction. One decision that you have likely already made is whether visitors find you at www.example.com or at example.com. Search engines consider these to be two different sites. As a result, you need to indicate that traffic to one site should be redirected to the other, ideally using a so-called “301” redirect, which indicates the redirection is permanent. Using the same method, you can also redirect from other domains (e.g. .net or .com.au) to your principal site. Redirection is generally configured on your web server, where other changes to URLs (such as the search engine friendly URLs discussed above) take place. You can read more about the detail here: SEO Book.com.
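On an Apache server (again, one common arrangement; other servers have equivalents), a permanent redirect from the bare domain to the www host might look like this, with example.com standing in for your own domain:

```apacheconf
RewriteEngine On
# Permanently (301) redirect example.com to www.example.com
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```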
Once you have made decisions about what we have termed influencing factors, you can turn your attention to some page-specific configuration issues. The first of these is something that re-imposes structure on HTML pages: so-called structured mark-up.
Structured mark-up is used to highlight key elements of webpages so that search engines, and to some degree social media sites, can better understand the content of a page. We will discuss the social media aspects of on-page mark-up when we discuss social media. However, the main search engine companies have agreed on the use of standard in-page data mark-up, as defined by schema.org. Schema.org structure allows specific data elements (such as addresses, the name of a university, an event, etc.) to be placed on a webpage according to the marketing and layout preferences of a site owner, but marked up in such a way that search engines will reliably find the data and display it correctly formatted in a search result listing. See our recent blog post for further explanation and this Moz.com reference for more background.
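As an illustration, schema.org data can be embedded as JSON-LD (one of the supported formats, alongside microdata and RDFa); the course and organisation names here are invented:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Course",
  "name": "Introduction to Statistics",
  "description": "A ten-week online introduction to statistics.",
  "provider": {
    "@type": "Organization",
    "name": "Example University",
    "sameAs": "https://www.example.com"
  }
}
</script>
```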
Managing duplicate content. For a number of reasons, you will sometimes have duplicate content on your site and this is a potential source of confusion. You need to tell search engines which content you want indexed and returned in search engine results and which you want ignored. You can do this with the page header link element rel=”canonical”. This setting signals to search engines that the URL given in the rel=”canonical” element is the preferred version to display in search results.
If your site has pages accessed with a combination of http:// and https://, then you need a slightly different approach. You should specify the URLs in the rel=”canonical” element as relative. That is, drop the http://www.example.com or https://www.example.com prefix and include everything that comes after, prefaced by a ‘/’. If you later change your mix of securely accessed pages, you will not need to amend each page affected by the change. As ever, the incredibly helpful Moz.com has more reading if this topic has caught your attention.
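For example (using the post’s example.com placeholder), the two forms look like this in the page head:

```html
<!-- absolute canonical URL -->
<link rel="canonical" href="https://www.example.com/courses/intro-to-statistics/">

<!-- relative form: the same tag works whether the page
     is served over http:// or https:// -->
<link rel="canonical" href="/courses/intro-to-statistics/">
```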
Handling unusual characters. This section is technical, which we did say some parts of this series would be, but it does have a bearing on internationalisation. Currently, there is no single standard for the character sets used on webpages, which is why you occasionally run into problems viewing accented or other ‘non-standard’ characters. Most browsers assume a site is using something called ISO-8859-1. This is a more limited set of character representations than the more modern standard known as UTF-8. We ran a small experiment on 224 online course provider websites (all the big MOOCs as well as a couple of hundred other well-known providers) to see what they specified. Almost 85% specified UTF-8, 5% still specified ISO-8859-1 and 10% provided no specification at all. As a ‘by the way’, the World Wide Web Consortium uses UTF-8. We suggest that for most Roman alphabet-based sites you set the character set to UTF-8. However, you can allocate a character set per page, which allows you to properly support other language representations on your website.
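Declaring the character set is a one-line addition to the page head:

```html
<head>
  <!-- declare UTF-8 before any other content in <head> -->
  <meta charset="UTF-8">
</head>
```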