- Tiny Lesson
As a consultancy, we’ve worked with many organisations that have a sprawling and diverse digital presence. It’s often hard to keep track of the sites that make up your web estate, with new sites popping up all the time. The scale of an estate often means that management is sparse, with many sites staying live longer than they should. Legacy sites may be vulnerable to cyber attacks that could compromise trust in your brand. How can you start getting your estate under control?
The challenge is often knowing where to start. The following are some practical tips to start building an inventory of your web estate. This is essential to understanding the size of your estate, and deciding what stays, what goes, and what needs updating.
Involve subject matter experts
It’s important in a project like this to involve the right stakeholders from the start. Your web estate almost certainly spans multiple teams, each with their own priorities, systems, and partial visibility of the estate. Involve departments like IT, procurement, digital and cyber security. These are also the teams that will be involved in future management of a web estate inventory, so it pays to involve them in the process early on (rather than surprise them with it later).
These teams may already be managing sub-sections of the estate themselves. Their lists and inventories are invaluable when you start to create one source of truth. They may include data that you won’t find in a site scrape, such as the intended purpose of and audience for the content, and contact details for site owners.
Use your analytics tools
One way to start uncovering the true scale of your web estate is by using analytics platforms like Google Analytics. Within your GA property, you can navigate to reports that include the ‘Hostname’ dimension (or build a custom exploration in GA4) to see every domain and subdomain where your tracking code is firing. This effectively gives you an inventory of sites that are sending data to your analytics account. This can reveal forgotten microsites, campaign domains, or staging environments still in use.
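If you export that hostname report as a CSV, a short script can turn it into a clean list. The following is a minimal sketch rather than a definitive method. It assumes the export has columns named "Hostname" and "Sessions", which are illustrative names only, so adjust them to match whatever your export actually contains.

```python
import csv
import io
from collections import Counter


def hostnames_by_sessions(csv_text, host_col="Hostname", count_col="Sessions"):
    """Total the sessions per hostname from an exported analytics CSV.

    The column names are assumptions about the export, not a fixed format.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Lower-case hostnames so 'WWW.Example.org' and 'www.example.org' merge.
        host = (row.get(host_col) or "").strip().lower().rstrip(".")
        if not host or host == "(not set)":
            continue  # skip rows with no usable hostname
        try:
            sessions = int((row.get(count_col) or "0").replace(",", ""))
        except ValueError:
            sessions = 0  # treat unreadable counts as zero rather than failing
        totals[host] += sessions
    # Busiest hostnames first.
    return totals.most_common()
```

Hostnames with very low traffic are often the forgotten microsites and staging environments, so the bottom of this list is worth as much attention as the top.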
Use tools to discover subdomains
Another useful technique is to use external discovery tools to find subdomains associated with your primary domains. These tools run a subdomain scan to find both known and previously unseen subdomains. The results can add to the lists you’ve already gathered, surfacing legacy applications or regional sites that may not be tracked in your analytics environments. This approach helps to build a more complete picture of your estate and bring unmanaged properties back under control. We've had success with the subdomain finder offered by PenTest, where a free account allows you to find up to 1,000 subdomains.
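To see what a scan has added, you can compare its results with the hostnames you already know about from analytics. This is a small sketch under the assumption that both lists are plain hostnames, one per entry. The function name is hypothetical.

```python
def untracked_hosts(discovered, tracked):
    """Return discovered hostnames that do not appear in the tracked list."""

    def clean(host):
        # Compare hostnames case-insensitively and ignore a trailing dot.
        return host.strip().lower().rstrip(".")

    tracked_set = {clean(host) for host in tracked if host.strip()}
    discovered_set = {clean(host) for host in discovered if host.strip()}
    return sorted(discovered_set - tracked_set)
```

Anything this returns is a candidate for follow-up with the teams involved earlier, since it exists publicly but is not reporting into your analytics account.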
Add metadata to your inventory
Once you have your list of URLs, you will need additional data points for each one in order to start deciding what can be retired and what should stay part of your estate. We recommend using a site crawling tool to return metadata for your list of URLs. A tool like Screaming Frog's SEO Spider or Sitebulb's website crawler will return data like the HTTP status and indexability of each site.
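For a short list, the same two data points can be approximated with a script. The sketch below is not how either crawler works internally. It assumes that a page counts as indexable when it returns a 200 status and carries no "noindex" directive in its X-Robots-Tag header or robots meta tag, which is a simplification (it ignores robots.txt and canonical tags, for example).

```python
import re
import urllib.error
import urllib.request

# Matches <meta ... name="robots" ...> tags, whatever the attribute order.
META_ROBOTS = re.compile(r'<meta[^>]+name=["\']robots["\'][^>]*>', re.IGNORECASE)


def is_indexable(status, x_robots_tag, html):
    """Simplified check: 200 status and no 'noindex' in header or meta tag."""
    if status != 200:
        return False
    if "noindex" in (x_robots_tag or "").lower():
        return False
    for tag in META_ROBOTS.findall(html or ""):
        if "noindex" in tag.lower():
            return False
    return True


def fetch_metadata(url, timeout=10):
    """Fetch one URL and report its status and indexability."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "estate-inventory-sketch"}
    )
    try:
        # urlopen follows redirects, so this is the status of the final page.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
            x_robots = response.headers.get("X-Robots-Tag", "")
            html = response.read(200_000).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # The server answered with an error status such as 404 or 500.
        return {"url": url, "status": error.code, "indexable": False}
    except OSError as error:
        # DNS failures, refused connections, and timeouts end up here.
        return {"url": url, "status": None, "indexable": False, "error": str(error)}
    return {
        "url": url,
        "status": status,
        "indexable": is_indexable(status, x_robots, html),
    }
```

Sites that fail to respond at all are worth recording too, since a dead hostname that still has a DNS entry is a common sign of a forgotten property.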
De-duplicate your list
Ahead of further analysis, removing duplicates from your inventory is essential. You'll be using a variety of sources, so it's very likely that some URLs will appear more than once. Removing duplication provides a clearer picture of your estate. Google Sheets and Excel both have built-in 'remove duplicates' functionality that makes this easy. Including a column for data source (e.g. web crawl, Google Analytics) helps to keep track of where each entry came from.
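If your lists are too large or too inconsistent for a spreadsheet, the same step can be scripted. The following is a rough sketch that merges several sources and records which ones each entry came from. It assumes that entries differing only in scheme, letter case, or a trailing slash are the same site, which may not hold for every estate.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL or bare hostname to 'host/path' for comparison."""
    url = url.strip()
    if not url:
        return ""
    if "://" not in url:
        url = "https://" + url  # bare hostnames need a scheme to parse correctly
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return host + parts.path.rstrip("/")


def merge_sources(sources):
    """Map each normalised entry to the sorted list of sources that contain it."""
    inventory = {}
    for source_name, urls in sources.items():
        for url in urls:
            key = normalise(url)
            if key:
                inventory.setdefault(key, set()).add(source_name)
    return {key: sorted(names) for key, names in sorted(inventory.items())}
```

For example, passing `{"Google Analytics": ["www.example.org"], "web crawl": ["https://WWW.Example.org/"]}` gives a single entry for `www.example.org` listing both sources.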
Creating an inventory of your estate takes work and investigative skills, but it’s worth it. It can reveal opportunities for your organisation to save costs, reduce risk, and shrink its carbon footprint.