This is the last of three articles addressing some of the privacy and information security risks that university and college websites present.
This post looks at a further source of privacy and information security risk: exposed email addresses.
We collected data to support our analysis by scanning the first 50 pages of 798 university and college websites (about 40,000 web pages) across six English-speaking countries: Australia, Canada, Ireland, New Zealand, the UK and the US.
We looked specifically for email addresses associated with a mailto: link – a link allowing a visitor to click and send an email from their local email program. The purpose of the scan was to gather evidence, from the field, about observable institutional practices on publishing personally identifiable information. Roughly 40% of the sites we examined published mailto: links.
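For readers who want to run a similar check on their own pages, the following is a minimal sketch of how mailto: links might be pulled out of a page's HTML using only the Python standard library. It is not the scanner used for this survey; the names MailtoCollector and extract_mailto are hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect the addresses found in <a href="mailto:..."> links."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        if href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... part, then decode %xx escapes.
            target = href[len("mailto:"):].split("?", 1)[0]
            self.addresses.append(unquote(target).strip())


def extract_mailto(html):
    """Return every mailto: target in the given HTML, in page order."""
    parser = MailtoCollector()
    parser.feed(html)
    return parser.addresses


# Hypothetical page fragment, for illustration only.
sample = (
    '<p><a href="mailto:info@example.edu?subject=Hello">Email us</a> '
    '<a href="/about">About</a> '
    '<a href="MAILTO:jane.doe%40example.edu">Jane</a></p>'
)
print(extract_mailto(sample))
```

An address harvesting program can do much the same thing, which is why we treat mailto: links as exposed addresses.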
Risk Management in Cinemascope
The first two articles support two important points that are reinforced by our review of email publication practices.
University and college websites present an almost unique risk management challenge. They comprise collections (“estates”) of sites, some under central control, some subject to central influence, others open to persuasion and others entirely independent. That set of relationships presents a tricky risk management or mitigation problem.
The loose affiliation of sites to a central management function makes having current site information imperative. In practice, it is easy to lack basic information about individual sites and whether they meet institutional standards or policies. If you don’t check what cookies are actually being loaded, you can’t make clear statements about how well policy is being followed.
And, as the current post demonstrates, if you don’t examine which email addresses are visible on your website you don’t know your risk exposure or even if the email addresses are “correct”.
Email Address Philosophy/Approaches
When we looked at higher education website privacy and information disclosure statements, we noticed that the documents cover website data collection and data use. Few of the statements we found discussed publishing personally identifiable information, such as email addresses or telephone numbers.
We can appreciate the attendant risks and form policy approaches by understanding what is happening on “actual” websites. In practice, we see five approaches:
The first approach funnels most email communication through web forms of varying levels of sophistication. Using a combination of drop-down menu options, visitors construct pre-formatted or free-form emails that are directed to the relevant individual. This can be accomplished without exposing the underlying staff or faculty email addresses.
Using web forms comes with a couple of obvious disadvantages. Many visitors do not take the time to complete web forms, as completing forms is inconsistent with email or social media posting practices. And, even if a visitor completes the form, there is no record in the visitor’s email software of the opening email conversations, potentially complicating response tracking.
A second approach eschews web forms, but restricts visible email addresses to generic, role-based addresses (a webmaster or data protection address, for example) that are routed to shared inboxes, where they can receive attention. In fact, for the 312 sites that yielded mailto: links, the following are the most popular generic email categories:
We’ll confess to conducting a couple of customer service tests on generic email addresses. We’ve sent tracked emails to generic webmaster and data protection department emails. Most organisations do not send automatic confirmations to ensure the sender knows the email has been received and the associated time frame within which a response should be expected. And, we’ve observed the subsequent email open rates, as best we can. Open rates fall very far short of 100%. This highlights a customer service issue – perhaps another type of organisational risk?
The third approach is to publish staff and faculty email addresses, as needed, across websites, typically on profile pages or “about us” sections. Of the 798 websites we checked, 312 (39.1%) published one or more email addresses (recall, we were looking for mailto: links). The distribution of the number of mailto: links we found on the first 50 pages of the 312 sites is summarised in the following histogram:
Graph 2: Shows the distribution of the number of mailto: links found on the first 50 pages of the 312 websites in our survey with mailto: links present. Approximately 40% of the sites had five or fewer mailto: links. At the other extreme, one site had over 300 mailto: links spread over 50 pages.
About half of the sites inspected had 10 or fewer email links per 50 web pages, while a small number included much larger numbers of clickable email links. On average, we found 19 mailto: links per 50 pages or about 1 per every 3 pages.
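The headline figures can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this post; the variable names are ours.

```python
sites_scanned = 798        # websites in the survey
pages_per_site = 50        # first 50 pages of each site
sites_with_mailto = 312    # sites publishing at least one mailto: link
mean_links_per_50 = 19     # average mailto: links per 50 pages

pages_scanned = sites_scanned * pages_per_site      # upper bound on pages scanned
share = sites_with_mailto / sites_scanned           # fraction of sites with mailto: links
pages_per_link = pages_per_site / mean_links_per_50

print(pages_scanned)             # 39900, i.e. about 40,000 pages
print(round(share * 100, 1))     # 39.1 (%)
print(round(pages_per_link, 1))  # 2.6, i.e. roughly one link every 3 pages
```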
There are a few issues with adopting this laissez-faire approach:
- We consider the mailto: links to be exposed email addresses, as email address harvesting programs will readily find these (as we did) and add them to lists for subsequent contact/spamming.
- The email addresses may have format errors: typos, spaces or other characters that break the mailto: link (we found these errors, as well).
- The email address may not conform with institutional policies to use ‘work’ addresses rather than personal addresses for website contacts. About 5.5% of the addresses we inspected used gmail.com, hotmail.com or yahoo.com as the contact email.
- The email address may be out of date, or the individual may no longer be on staff, so the address needs updating.
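Two of the issues above, format errors and personal addresses, can be checked automatically once the addresses have been collected. The following is a minimal sketch under stated assumptions: the pattern is a deliberately simple one rather than a full RFC 5322 validator, the personal domains are just the three named above, and audit_address is a hypothetical helper name.

```python
import re

# Assumption: a simple "local@domain.tld" shape, not a complete validator.
SIMPLE_ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")

# The three personal-mail domains mentioned in the list above.
PERSONAL_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com"}


def audit_address(address):
    """Return a list of issue labels for one published address."""
    if not SIMPLE_ADDRESS.fullmatch(address):
        # Spaces, missing domains and stray characters all land here.
        return ["format error"]
    domain = address.rsplit("@", 1)[1].lower()
    if domain in PERSONAL_DOMAINS:
        return ["personal address"]
    return []


# Hypothetical addresses, for illustration only.
for candidate in ["jane.doe@example.edu", "jane doe@example.edu", "j.doe@Gmail.com"]:
    print(candidate, audit_address(candidate))
```

Whether an address is out of date cannot be determined from the page alone; that check needs a comparison against a staff list or directory.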
We’ve observed that, at a number of institutions where senior management staff provide contact emails, the mailto: link actually directs to an assistant, suggesting an awareness of the spam problem.
The fifth and, in our view, easiest to manage option is to provide a central staff and faculty directory, searchable via on-page queries. Database-driven directories are less readily harvested by website crawler programs and are subject to central management and to updates by the listed individuals and departments.
Having looked at hundreds of university and college websites, we would only add two suggestions: be sure to guide visitors to the directory, and permit searches that allow departments to be located as well as individuals. After all, the objective is to facilitate, not hinder, communication.
For each of the approaches listed above, we assume that spam filtering and anti-virus scanning are already in place to weed out most of the problematic emails being received. We also note that we have seen hybrid approaches: typically, a searchable directory with other email addresses scattered across a site as ‘needed’.
There are five main approaches to ensuring that site visitors at higher education institutions can reach individuals by email. A central, searchable contact database provides a manageable means of ensuring visitors can find staff and faculty, so long as the directory is easy to find.
On the other hand, providing on-page email addresses avoids the need to re-direct visitors to other sections of a website. But, decentralising email address access means checking and updating email addresses scattered across hundreds or thousands of pages. And, given the fragmented way in which university and college websites have evolved, the latter situation likely prevails at your higher education institution and presents an interesting opportunity to review the current approach.