Ways To Prevent Link Rot

As mentioned in other posts, there is a serious problem with link rot in legal sources online. Nearly half of the links in SCOTUS opinions no longer work, and these opinions shed light on a much larger issue. "According to the Chesapeake Digital Preservation Group, a collaborative archiving program, the average life span of a webpage is between 44 and 75 days."

The problem with link rot was brought to light when Jonathan Zittrain of Harvard studied the SCOTUS opinions mentioned above. As he and the study's other authors explained, “[t]he link, a URL, points to a resource hosted by a third party.” They continued: “That resource will only survive so long as the third party preserves it. And as websites evolve, not all third parties will have a sufficient interest in preserving the links that provide backwards compatibility to those who relied upon those links.”

After this study, the Harvard Library Innovation Lab created Perma.cc to help scholars protect and preserve the information that they rely on.

Another way to preserve material is to capture a webpage with the Internet Archive's Wayback Machine, which saves the page as it appears now so that a citation to it stays reliable in the future.
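For readers who want to check whether a page they cite already has a Wayback Machine capture, here is a minimal sketch built around the Internet Archive's public availability endpoint. The helper names (`availability_query`, `closest_snapshot`) and the sample payload are illustrative assumptions, and the code only builds the query and reads a response; it does not make the network request itself.

```python
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the query URL that asks whether page_url has been archived.

    timestamp is optional and uses the YYYYMMDD form, for example "20150101".
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from a decoded JSON response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Illustrative payload shaped like a response; the values are made up.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150101000000/http://example.com/",
            "timestamp": "20150101000000",
            "status": "200",
        }
    }
}

print(availability_query("https://example.com/page"))
print(closest_snapshot(sample))
```

To use it against the live service, fetch the query URL (for instance with `urllib.request.urlopen`), decode the JSON, and pass the resulting dictionary to `closest_snapshot`. If it returns `None`, the page has no capture yet and is a candidate for saving.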

For content providers, the folks at Associations Now put together a link-fixing checklist to help them create reliable links that keep working well into the future:

1. Talk with your vendors and developers about permalink structure.

2. Pretty permalinks preferred. It’s 2015, and while gibberish-heavy URLs were never really in vogue, they look downright ancient compared to a well-considered URL structure. If you’re eyeing a restructure of your website’s links, creating a well-organized permalink structure that cites the basic subject matter and is self-explanatory is the way to go, if your content management platform allows you to do so.

3. Know your redirects. If you’re changing up your links, you need to ensure that your old links go to the right place. It can be a headache to redirect URLs, but it’s definitely possible. That said, the two primary strategies have different purposes. The 301 redirect, which is generally added at the server level, tells the browser that the page has moved permanently, so it forwards to the latest version. This is most useful for users and is said to have no effect on search engine results. Canonical tags, meanwhile, are generally added to individual pages and are not true redirects; they are essentially messages to search engines, telling crawlers that the primary version of the page is somewhere else, which might come in handy in cases of duplicate or republished content.

4. Make your old content web-friendly. If long-term preservation and readability are your goals, you should convert PDFs to HTML, which will future-proof them for mobile, watches, or whatever content-delivery innovation is coming next. This isn’t a hard-and-fast rule (old magazines, for example, may make more sense in PDF format), but it should be a discussion point.
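Items 2 and 3 of the checklist can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `REDIRECTS` table, and the example paths are hypothetical, and a real site would normally configure redirects in its web server or content management platform rather than in hand-written code.

```python
import html
import re


def slugify(title):
    """Turn a post title into a readable, self-explanatory permalink slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Hypothetical map from old, gibberish-heavy paths to new permalinks.
REDIRECTS = {
    "/node/4821?p=17": "/blog/" + slugify("Ways To Prevent Link Rot"),
}


def respond(path, redirects=REDIRECTS):
    """Return (status, location): 301 for a moved page, 200 otherwise."""
    if path in redirects:
        return 301, redirects[path]
    return 200, path


def canonical_tag(primary_url):
    """Build the per-page tag that tells crawlers where the primary copy is."""
    return '<link rel="canonical" href="{}">'.format(
        html.escape(primary_url, quote=True)
    )


print(respond("/node/4821?p=17"))
print(canonical_tag("https://example.org/blog/ways-to-prevent-link-rot"))
```

The split mirrors the two strategies in the checklist: `respond` acts for every visitor by sending the old address to the new one, while `canonical_tag` only emits a hint inside a page's HTML head for search engines to read.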

If content providers work to preserve links on the front end and scholars work to save content on the back end, we should eventually run across far fewer dead links in our research.

