Broken link building isn’t a new concept. The basic idea is to find previously published content on the web that has been taken down or otherwise lost, and restore it in a new way. Broken link building campaigns can be even more successful when the resurrected content takes the conversation further than its predecessor.
Before we get into it, consider the alternative: instead of recreating a piece of content, try writing your own. Broken link building is a tactic and your time might be better spent creating something worth linking to, and promoting it.
Not just any content will do; you’re looking for something that had a measure of popularity or authority, earned either through links or via social media. In terms of SEO, Google treats links like votes. Sites with more votes tend to do better in search results, especially for competitive terms.
Content that gets scrapped may have been valuable enough at one time to acquire links from other websites. If a page is destroyed, those hyperlinks lead to a dead end. That is not an ideal experience for the visitors of those linking websites. Restoring the content provides those webmasters with a new resource to share with their audiences.
Why do some pages disappear? Here are the most common reasons that pages get retired or break:
- Design changes: When a website gets an upgrade visually, URLs can get lost in the shuffle.
- CMS changes and migrations: Moving from one content management system to another can wreak havoc on a website’s architecture, especially when it comes to URLs.
- Change in services or corporate direction: Sometimes URLs will be purposefully retired because their content no longer matches the owner’s business goals.
As you uncover broken links, it should usually be obvious which of the reasons above explains why a page has gone missing. The first two causes are generally unintentional, so I recommend reaching out to those sites about their broken pages. That message may mean a lot to the content’s owner and may benefit you both in the end.
How do you find broken links?
First, you need to know what a “broken” link is. In our case, we’re looking for URLs that return a 404 Not Found response. We understand web pages based on their design and content, but search engines and web browsers (like Chrome or Firefox) see them differently.
When we load a web page in our browser, there is a lot going on behind the scenes as we “download” that page to our computer. Before any of that happens, though, the web server sends the browser a status code. 200 is the code for “everything is OK,” while 404 means the URL or file couldn’t be found.
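To make the status code idea concrete, here is a minimal Python sketch that asks a server for a URL’s status and labels the result. The names `fetch_status` and `describe_status` and the user agent string are my own placeholders, not part of any tool mentioned in this post. Some servers refuse HEAD requests, so treat this as a starting point rather than a finished checker.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    # A HEAD request asks for the status and headers without the page body.
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is still useful.
        return error.code
    except URLError:
        # DNS failures, refused connections, timeouts and similar problems.
        return None


def describe_status(code):
    """Turn a status code into a rough human-readable label."""
    if code is None:
        return "no response"
    if code == 404:
        return "broken (not found)"
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        # urlopen follows redirects itself, so 3xx codes rarely surface here.
        return "redirect"
    return "other error"
```

Calling `describe_status(fetch_status(url))` on a page you suspect is dead gives a quick second opinion before you spend time researching it.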
In addition to broken pages, we also want to find content related to our own. When starting a broken link building campaign, we should have a search keyword in mind: a term that people are searching for and that we want to rank for.
I use a few tools and methods to find relevant, broken content worth recreating.
Browser Extensions for Broken Link Building
It only makes sense that most of my tools would be right there in my browser. Here’s the list of Chrome extensions I use for projects like this.
This tool is commonly used for quickly detecting redirects and redirect chains, but it displays all HTTP headers and status codes, including 404s. It’s great for at-a-glance confirmation that a URL is broken.
Broken Link Checker checks the status of every link on a web page. Live links appear as green blocks while broken links appear as red. It also has the ability to check an entire website as part of a paid subscription. For this tutorial, the free version is fine.
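If you are curious what a link checker has to do before it can test anything, the sketch below collects every link on a page and resolves relative paths against the page’s own URL. It illustrates the general idea only and is not how the Broken Link Checker extension is implemented; `LinkCollector` and `extract_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved to an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip in-page anchors and non-web schemes that cannot return a 404.
        if href and not href.startswith(("#", "mailto:", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return all link targets found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Each URL in the returned list can then be requested in turn, and any that answer with a 404 are candidates for a closer look.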
I’ll discuss Archive.org next, but this tool essentially captures 404 pages and immediately tries to find them on the Wayback Machine, an archive of the internet.
Intro to Archive.org’s Wayback Machine
In order to bring content back to life, you need to see it. Archive.org’s Wayback Machine, like Google, crawls the web and collects information. The Wayback Machine stores “snapshots” of web pages, which let you look back in time, at least in terms of the web.
Amazon is a great example of how much Archive.org captures and how much Amazon has changed over the years. Visit archive.org and punch in your favorite website to get a timeline like the one shown below.
Click any of the years to “zoom in” on that year. Then select any day from that year with a blue circle to see what that page looked like on that date in history, according to the Wayback Machine.
Not only is the content still there, but many of the images are as well. Not totally relevant to what we’re trying to accomplish, but cool nonetheless.
But let’s get back on track now that we know what the Wayback Machine can do. Archive.org can show the Ghosts of Content Past, but we need to know where to look.
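Archive.org also exposes a small JSON endpoint that reports whether a URL has an archived snapshot, which is handy when you have a long list of dead URLs to check. The sketch below builds the request and reads the answer. The function names are hypothetical, and the response shape in the comments reflects my understanding of the availability API, so confirm it against Archive.org’s documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url):
    """Build the availability lookup URL for a page we want to find."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload):
    """Return the snapshot URL from a decoded response, or None if not archived.

    Assumed shape: {"archived_snapshots": {"closest": {"available": true, "url": "..."}}}
    with an empty "archived_snapshots" object when nothing is archived.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url, timeout=10):
    """Ask the Wayback Machine for the closest snapshot of page_url."""
    with urlopen(build_availability_url(page_url), timeout=timeout) as response:
        payload = json.load(response)
    return closest_snapshot(payload)
```

A return value of `None` from `find_snapshot` corresponds to the Wayback Machine drawing a blank for that URL.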
Broken Link Building Example
I won’t cover all of the steps here, but this should help you get started. Since this blog covers all things digital marketing, I’ll choose a related topic. Let’s say I’d like to get some links or votes for my post on technical SEO.
The first step in my broken link building exercise is to determine who already owns a search such as “what is technical SEO?”
Neil Patel’s QuickSprout post about technical SEO is the clear winner, with high rankings and a special placement in the SERP (search engine results page). Given Neil’s reputation as an authority in digital marketing, I was a bit surprised to see broken links on this page. However, QuickSprout as a publication publishes a huge amount of content, so I can see the occasional URL or link getting lost in the shuffle.
Using the Broken Link Checker extension for Chrome, I see that there are more than 206 broken links on this page. But let’s call those what they really are: content opportunities!
There are two actions I need to take when discovering broken links on an authoritative website:
- Inform the site’s or content’s owner about the broken internal links. I’ll cover this step in another post.
- Investigate external broken links and attempt to recover their content.
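Sorting links into those two buckets is easy to automate once you have a list of URLs from the page. Here is a rough sketch that compares each link’s hostname with the hostname of the page it came from. The helper names are hypothetical, and treating `www.example.com` and `example.com` as the same site is a simplifying assumption of mine.

```python
from urllib.parse import urlparse


def normalized_host(url):
    """Return the lowercase hostname of url without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def split_links(page_url, links):
    """Split links into (internal, external) relative to page_url's host."""
    page_host = normalized_host(page_url)
    internal, external = [], []
    for link in links:
        if normalized_host(link) == page_host:
            internal.append(link)  # report these to the site's owner
        else:
            external.append(link)  # investigate these for lost content
    return internal, external
```

Subdomains are counted as external here, which may or may not match how you think about a given site.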
Clicking on any of the broken URLs will launch the 404-me-not plugin, which searches both Google and the Wayback Machine for the URL. In this case, the external broken links belong to Google. And since mobile SEO is critical right now, I focus my attention on https://developers.google.com/search/mobile-sites/mobile-seo/configurations/dynamic-serving, which is returning a 404.
The Wayback Machine draws a blank for this URL, meaning it’s not archived. Google, however, found this content. The 404-me-not plugin generated this search for me:
("separate-urls" ) & (site:developers.google.com)
The valid URL that QuickSprout should be linking to is:
It looks like Google dropped the /configurations/ directory and didn’t create redirects. That’s OK; we can write a post about this later and give the big G a heads-up.
Our investigation has taken an interesting turn. Instead of recreating something that had been lost, we can create something brand new to inform both Neil and Google about potential optimizations to their sites.
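I don’t know exactly how the plugin builds its search, but a similar site-restricted query is simple to assemble by hand. The sketch below takes the last path segment of a broken URL as the search phrase and limits results to the same host using Google’s `site:` operator. `build_site_query` is a hypothetical helper, and using the final segment as the phrase is my assumption, not the plugin’s documented behavior.

```python
from urllib.parse import urlparse


def build_site_query(broken_url):
    """Build a site-restricted search query from a broken URL."""
    parsed = urlparse(broken_url)
    # Keep only non-empty path segments so a trailing slash is ignored.
    segments = [segment for segment in parsed.path.split("/") if segment]
    if not segments:
        return f"site:{parsed.hostname}"
    # Quote the slug so the search engine treats it as an exact phrase.
    return f'"{segments[-1]}" site:{parsed.hostname}'
```

For the dynamic-serving URL above, this produces `"dynamic-serving" site:developers.google.com`, which you can paste straight into a search box.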
Had the content been truly gone and not just changed, we could have resurrected it and informed QuickSprout about the content’s new location. I’ll cover what that looks like in another post as well as how to actually start conversations with webmasters and bloggers.