Web Estate Registry

Autonomous website development and content creation lead to sprawling web estates.
Web estates dilute branding, messaging and content quality, making higher education digital marketing less effective.
A web estate registry records and monitors websites to support digital governance and boost marketing and communications effectiveness.

Higher education institutions typically let websites develop organically, leading to web estates with hundreds or thousands of autonomous sites: estates in which the precise number of sites is unknown, individual site ownership is unclear and there is no effective digital oversight.

Our service discovers every website, logging key site data to create a comprehensive web estate registry. It then continuously monitors sites on a user-defined schedule.

The registry provides accurate and timely data for digital governance and oversight, and makes digital marketing and communications more effective.

This data lets higher education institutions answer wider questions such as:


Data Protection

With the EU's General Data Protection Regulation coming into force, how many of our websites will be affected? How many of our sites use forms to gather personal data, and on which pages?

Internal Audit

Our internal audit group needs a list of all our websites for a value-for-money study of hosting service usage. Do we have such a list? In fact, how many websites do we have? How many are hosted internally versus externally?


Accessibility

Our websites must be accessible. In implementing institution-wide accessibility, how many sites would be involved? How many content management systems would be affected? How many pages?

Security & Privacy

We want all of our institution's websites to offer secure HTTPS connections. How many of our sites does this apply to? What web servers do we currently use? Are our HTTPS sites using certificates from our preferred supplier?

Website Discovery

Starting from an initial seed list of urls our scanning software intelligently follows every link to discover all the websites and microsites in an institution’s web estate.
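As a rough illustration of how such a crawl proceeds, the sketch below does a breadth-first walk from a seed list, following every link and recording each distinct hostname as a candidate site. This is a minimal stdlib-only sketch, not our production scanner; the `fetch` callback and the example URLs are purely illustrative.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href targets of anchor tags on one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_sites(seed_urls, fetch):
    """Breadth-first crawl from seed URLs, following every link and
    recording each distinct hostname as a candidate website.
    `fetch(url)` returns the page HTML, or None if unreachable."""
    queue = deque(seed_urls)
    visited, hosts = set(), set()
    while queue:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        hosts.add(urlparse(url).hostname)
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute not in visited:
                queue.append(absolute)
    return hosts


# Illustrative in-memory "web estate" standing in for live HTTP fetches.
pages = {
    "https://www.example.ac.uk/": '<a href="/research">R</a> '
                                  '<a href="https://alumni.example.ac.uk/">A</a>',
    "https://www.example.ac.uk/research": '<a href="https://micro.example.net/">M</a>',
    "https://alumni.example.ac.uk/": "",
    "https://micro.example.net/": "",
}
hosts = discover_sites(["https://www.example.ac.uk/"], pages.get)
```

Note how the crawl surfaces `micro.example.net`, a microsite off the main domain, which a manual inventory would be likely to miss.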

Data Collection

As each site is identified the scanning process captures the following base data:

  • technologies - security measures implemented, web server configuration and set-up, content management system(s)
  • site configuration - cookies, policy and privacy links and page counts
  • content – content types used and metadata
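To make the base data concrete, here is one way a single registry record could be derived from a site's HTTP response headers and home-page HTML. The field names and detection heuristics are illustrative assumptions, not the actual schema of our registry.

```python
import re


def capture_base_data(url, headers, html):
    """Derive an illustrative base registry record for one site from
    its response headers and home-page HTML."""
    # CMS hint from the common <meta name="generator"> tag.
    generator = re.search(
        r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']+)',
        html, re.IGNORECASE)
    cookie_header = headers.get("Set-Cookie", "")
    return {
        "url": url,
        "https": url.startswith("https://"),
        "web_server": headers.get("Server", "unknown"),
        "cms": generator.group(1) if generator else "unknown",
        # Name of the first cookie set by the site, if any.
        "cookies": [cookie_header.split("=", 1)[0]] if "=" in cookie_header else [],
        "has_privacy_link": "privacy" in html.lower(),
    }


record = capture_base_data(
    "https://www.example.ac.uk/",
    {"Server": "Apache", "Set-Cookie": "sessionid=abc; Path=/"},
    '<meta name="generator" content="WordPress 4.9">'
    '<a href="/privacy">Privacy</a>')
```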


Data Consolidation

The Web Estate Registry holds:

  • a central, single-source-of-the-truth database of all of an institution's websites, content owners and critical site data
  • the key information to explore, identify and evaluate website enhancement and risk minimization opportunities


Ongoing Monitoring

User-configurable scheduling of website and web estate scans keeps data fresh and provides out-of-scope alerts.


Web Estate Registry with Multiple Registers

A registry of all the sites within a web estate can be arranged as a set of sub-registers reflecting a university or college’s reporting needs. The system reports the total number of sub-registers and the number of websites in each sub-register.

Sample Register and Websites

Each website entry records a site’s owner and contact details along with technical, social media, content, security and privacy data.

Registry and sub-register data can be filtered, searched and queried to answer institution-wide, or site-specific, questions. For example, which content management systems do our sites use? Which sites still need to upgrade to HTTPS? What Facebook or Twitter accounts do our sites use?
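The example questions above amount to simple queries over the registry data. The sketch below shows how two of them might look against a hypothetical relational store; the table layout, column names and sample rows are illustrative assumptions, not our actual schema.

```python
import sqlite3

# Minimal in-memory stand-in for a registry database.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE registry (
    url TEXT, owner TEXT, sub_register TEXT, cms TEXT, https INTEGER)""")
db.executemany("INSERT INTO registry VALUES (?, ?, ?, ?, ?)", [
    ("https://www.example.ac.uk", "Comms", "central", "Drupal", 1),
    ("http://physics.example.ac.uk", "Physics", "departments", "WordPress", 0),
    ("https://alumni.example.ac.uk", "Alumni", "central", "WordPress", 1),
])

# Which content management systems do our sites use?
cms_counts = dict(db.execute(
    "SELECT cms, COUNT(*) FROM registry GROUP BY cms"))

# Which sites still need to upgrade to HTTPS?
needs_https = [row[0] for row in db.execute(
    "SELECT url FROM registry WHERE https = 0")]
```

The same pattern extends to any column the registry records: owners, sub-registers, social media accounts and so on.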

Website Summary - More Detail Accessed via Top Right HTML/PDF Icons

Click on a website record to view summary website data. Click on the HTML/PDF icons for full technical, social media, content, security and privacy details.


The discovery phase systematically identifies all of a higher education institution's websites and microsites, including those not using the main domain name.

Where to Start?

Institutions can often poll website owners and use existing lists to seed a comprehensive automated discovery exercise.

As well as needing somewhere to start, discovery needs careful planning to determine:

  • Which IP address ranges are relevant?
  • Which domains should be examined?
  • Should the exercise apply to internal websites as well as public-facing ones?

When to Stop?

Discovery means systematically checking every page link to uncover other relevant sites.

Scanning and site identification is iterative, continuing until no new servers or sites are identified.

In practice, scans can often be limited to thousands of pages, selectively inspecting server timestamps to focus on recent content, thus shortening the discovery phase.
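One simple way to apply that timestamp filter is to read each response's `Last-Modified` header and skip pages older than a chosen scanning window. This is a sketch of the idea only; the helper name and default window are illustrative.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_recent(last_modified, max_age_days=365):
    """Decide whether a page's Last-Modified header falls inside the
    scanning window. Pages without a date are scanned, to be safe."""
    if not last_modified:
        return True
    modified = parsedate_to_datetime(last_modified)
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return modified >= cutoff
```

A crawler would call this before queueing a page's links, trading a little completeness for a much shorter discovery phase.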

A scanning exercise delivers a candidate list of URLs that can be filtered to a list of sites for loading to a registry.

Data Collection

Once discovery is complete, every site is re-visited to collect critical content, user experience, performance, accessibility, structure and technical data for ongoing monitoring.

Data collection acquires data about each website's underlying technology infrastructure, the website implementation and relevant page content.

Up-to-date page content and site configuration information lets you understand: 

  • Web page metadata – titles, descriptions and other elements
  • JavaScript used to provide analytics, advertising and other on-page functions
  • Cookies being used
  • Whether privacy, accessibility and other policy statements are present
  • Accessibility compared with WCAG 2.0
  • Total counts of scanned pages for each site (recorded during the survey)

This allows marketing and communications staff, content editors and developers to identify and respond to user experience and related concerns as needed.
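A couple of these per-page checks can be sketched with a plain HTML parser: counting images that lack alt text (a basic WCAG 2.0 signal) and spotting links to policy pages. This is an illustrative sketch, not our collection engine; the class name and the keyword list are assumptions.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collects simple signals from one page: images lacking alt text
    and the targets of policy-looking links."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = 0
        self.policy_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1
        if tag == "a":
            href = (attrs.get("href") or "").lower()
            if any(word in href for word in ("privacy", "accessibility", "cookies")):
                self.policy_links.append(href)


audit = PageAudit()
audit.feed('<img src="logo.png"><img src="map.png" alt="Campus map">'
           '<a href="/privacy">Privacy</a>')
```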

With current and accurate data collected for each website, technical staff can change, modify and update systems and servers as appropriate.


Data Consolidation

All data is held in a web estate registry database for ongoing issue analysis and governance reporting.


A web estate registry holds data about individual websites within an institution's set of websites. Sites can be grouped for reporting purposes and specific site 'owners' identified to resolve performance or content issues.

Each site's data is automatically updated on a user-defined schedule.
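The user-defined schedule boils down to each record carrying its own rescan interval, with the monitor picking out the sites whose interval has elapsed. A minimal sketch, with illustrative field names and dates:

```python
from datetime import date, timedelta


def sites_due_for_rescan(registry, today):
    """Return the URLs of sites whose user-defined rescan interval
    has elapsed since their last scan."""
    return [site["url"] for site in registry
            if site["last_scanned"] + timedelta(days=site["interval_days"]) <= today]


registry = [
    {"url": "https://www.example.ac.uk",
     "last_scanned": date(2018, 1, 1), "interval_days": 7},
    {"url": "https://alumni.example.ac.uk",
     "last_scanned": date(2018, 1, 14), "interval_days": 30},
]
due = sites_due_for_rescan(registry, today=date(2018, 1, 15))
```

Per-site intervals let busy, frequently edited sites be rescanned weekly while stable microsites are checked monthly or quarterly.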

Analysis can be carried out across sites or within groups of sites to identify and resolve systemic issues, identify common risk exposures, support web governance initiatives and ensure compliance with branding and other marketing and communications policies.


The web estate registry database can be queried to:

  • generate status reports, for example all sites using a specific version of WordPress
  • produce risk exposure reports covering financial, legal, regulatory and security risks
  • highlight content quality, user experience or other issues affecting digital marketing and communications campaigns.


North America
Toronto | Canada
+1 416 464 9771

Edinburgh | UK
+44 203 290 3575