Like most large websites, Wikipedia suffers from the phenomenon known as link rot, where external links, often used as references and citations, gradually become irrelevant or broken as the linked websites disappear, change their content, or move. This presents a significant threat to Wikipedia's reliability policy and its citing sources guideline.
Preventing link rot takes significantly less effort than repairing or mitigating a rotten link, so prevention strengthens the encyclopedia. This guide provides strategies for preventing link rot before it happens. These include the use of web archiving services and the judicious use of citation quotes.
However, link rot cannot always be prevented, so this guide also explains how to mitigate link rot by finding previously archived links and other sources. These strategies should be implemented in accordance with Wikipedia:Citing sources#Preventing and repairing dead links, which describes the steps to take when a link cannot be repaired.
Do not delete factual information solely because the URL to the source no longer works. Wikipedia's verifiability policy does not require that all information be supported by a working link, nor does it require the source to be published online.
Except for URLs in the External links section that have not been used to support any article content, do not delete a URL solely because the URL does not work any longer. Recovery and repair options and tools are available.
Preventing link rot
As you write articles, you can help prevent link rot in several ways. The first way to prevent link rot is to avoid bare URLs by recording as much of the exact title, author, publisher, and date of the source as possible. If the link goes bad, this added information can help a future Wikipedian, either editor or reader, locate a new source for the original text, either online or a print copy. This likely wouldn't be possible with only an isolated, bare URL that no longer worked. Local and school libraries are a good resource for locating such offline sources. Many local libraries have in-house subscriptions to digital databases or inter-library loan agreements, making it easier to retrieve hard-to-find sources.
As you edit, if an article has bare URLs in its citations, fix them or at least tag the References section with {{linkrot}} as a reminder to complete the citation details as described above, and to categorize the article as needing cleanup.
Web archive services
A second way to prevent link rot is to use a web archiving service. The two most popular services are the Wayback Machine, which passively archives many web pages, and WebCite, which provides on-demand web archiving. These services collect and preserve web pages for future use even if the original web page is moved, changed, deleted, or placed behind a paywall. Web archiving is especially important when citing web pages that are unstable or prone to change, such as time-sensitive news articles or pages hosted by financially distressed organizations. Once you have the URL for the archived version of the web page, use the archiveurl= and archivedate= parameters in the citation template that you are using. The template will automatically incorporate the archived link into the reference.
- Dubner, Stephen J. (January 24, 2008). "Wall Street Journal Paywall Sturdier Than Suspected". The New York Times Company. http://freakonomics.blogs.nytimes.com/2008/01/24/wall-street-journal-paywall-sturdier-than-suspected/?scp=1-b&sq=paywall&st=nyt. Retrieved 2009-10-28.
- Dubner, Stephen J. (January 24, 2008). "Wall Street Journal Paywall Sturdier Than Suspected". The New York Times Company. Archived from the original on 2008-04-30. http://web.archive.org/web/20080430085418/http://freakonomics.blogs.nytimes.com/2008/01/24/wall-street-journal-paywall-sturdier-than-suspected/. Retrieved 2009-10-28.
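As a rough illustration of filling in the archiveurl= and archivedate= parameters, the Python sketch below splits a Wayback Machine snapshot URL into its timestamp and original URL, then assembles citation wikitext. The function names are hypothetical, and the choice of {{cite web}} with url=, title=, and accessdate= parameters is an assumption; adapt it to whichever citation template the article actually uses.

```python
import re
from datetime import datetime

# A Wayback Machine snapshot URL embeds a 14-digit timestamp (YYYYMMDDhhmmss)
# followed by the original URL, as in the example citation above.
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_wayback_url(archive_url):
    """Return (original_url, snapshot_date) parsed from a snapshot URL."""
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        raise ValueError("not a Wayback Machine snapshot URL: " + archive_url)
    stamp, original = match.groups()
    return original, datetime.strptime(stamp, "%Y%m%d%H%M%S").date()


def cite_web_with_archive(title, archive_url, accessdate, **extra):
    """Build citation wikitext with archiveurl= and archivedate= filled in.

    Assumption: the {{cite web}} template is used. Values containing a
    pipe character would need escaping, which this sketch does not do.
    """
    original, snapshot = split_wayback_url(archive_url)
    params = {"url": original, "title": title}
    params.update(extra)  # e.g. last=, first=, publisher=
    params.update({
        "archiveurl": archive_url,
        "archivedate": snapshot.isoformat(),
        "accessdate": accessdate,
    })
    body = " |".join(f"{key}={value}" for key, value in params.items())
    return "{{cite web |" + body + "}}"
```

Deriving archivedate= from the snapshot URL itself keeps the two parameters consistent, so the date shown in the reference always matches the archived copy that is linked.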
However, not every web page can be archived. Webmasters and publishers may use a robots exclusion standard in their domain to disallow archiving, or rely on complicated JavaScript, Flash, or other code that can't easily be copied. In these cases, alternate methods of preserving the data may be available.
Alternate methods
Most citation templates have a quote= parameter that can be used to store text quotes of the source material. This can be used to store a limited amount of text from the source within the citation template. This is especially useful for sources that cannot be archived with web archiving services. It can also provide insurance against failure of the chosen web archiving service.
- Dubner, Stephen J. (January 24, 2008). "Wall Street Journal Paywall Sturdier Than Suspected". The New York Times Company. Archived from the original on 2008-04-30. http://web.archive.org/web/20080430085418/http://freakonomics.blogs.nytimes.com/2008/01/24/wall-street-journal-paywall-sturdier-than-suspected/. Retrieved 2009-10-28. "...the Wall Street Journal will not, as has been widely speculated, tear down its paywall entirely..."
When using the quote parameter, choose the most succinct and relevant material possible that preserves the context of the reference. Storing the entire text of the source is not appropriate under fair use policies, so choose only the most important portions of the text that most support the assertions in the Wikipedia article.
A quote also helps searching for other on-line versions of the source in the event that the original is discontinued.
Where applicable, public domain materials can be copied to Wikisource.
Repairing a dead link
There are several ways to repair a dead link. Often web pages have simply moved, either in connection with a migration to a new server, or through general site maintenance. A site index is a useful place to locate the moved page. A search engine query using the title of the page, possibly with a search restriction to the same site, might also find the page. Using the examples from above, a Google search might look like: site:http://freakonomics.blogs.nytimes.com/ "Wall Street Journal Paywall Sturdier Than Suspected"
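The site-restricted search described above can be assembled mechanically from the dead URL and the page title. The following is a minimal Python sketch with hypothetical function names. It restricts the search to the bare host name rather than the full URL, which is an assumption about what search engines accept most reliably, and the Google endpoint is only a default that can be swapped for another engine.

```python
from urllib.parse import urlencode, urlsplit


def site_search_query(dead_url, title):
    """Build a query that looks for the exact title on the dead link's host."""
    host = urlsplit(dead_url).netloc
    return f'site:{host} "{title}"'


def search_url(query, endpoint="https://www.google.com/search"):
    """Turn the query into a URL that can be opened in a browser."""
    return endpoint + "?" + urlencode({"q": query})
```

For the example citation, `site_search_query` yields a query restricted to freakonomics.blogs.nytimes.com with the article title in quotation marks, so only exact-title matches on that site are returned.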
Failing that, check for archived versions of the page in the archiving services. Consult the Wayback Machine, the query page of WebCite and, if applicable, any other relevant web archive. If you find an archived version, double-check that the material still supports the citation. It is also a good idea to consult the access date of the citation (if one was specified) to see how close in time the archived version is to the page as it was when cited.
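When several archived snapshots exist, the comparison against the citation's access date can be sketched as below. This is an illustrative Python sketch, not a prescribed procedure: the function names are hypothetical, snapshot timestamps are assumed to be in the Wayback Machine's YYYYMMDDhhmmss form, and `availability_query` assumes the Internet Archive's public availability endpoint with its `url` and `timestamp` parameters.

```python
from datetime import date, datetime
from urllib.parse import urlencode


def snapshot_day(stamp):
    """Convert a Wayback-style timestamp (YYYYMMDDhhmmss) to a date."""
    return datetime.strptime(stamp[:8], "%Y%m%d").date()


def closest_snapshot(stamps, access_date):
    """Pick the snapshot timestamp nearest in time to the access date."""
    if not stamps:
        raise ValueError("no snapshots to choose from")
    return min(stamps, key=lambda s: abs((snapshot_day(s) - access_date).days))


def availability_query(url, access_date):
    """Build a lookup URL asking for the snapshot nearest the access date.

    Assumption: the archive.org availability endpoint is used.
    """
    params = {"url": url, "timestamp": access_date.strftime("%Y%m%d")}
    return "https://archive.org/wayback/available?" + urlencode(params)
```

The nearest snapshot may postdate the access date, so it still needs a manual check that the archived text supports the cited statement.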
Mitigating a dead link
At times, all attempts to repair the link will be unsuccessful. In that event, consider finding an alternate source so that the loss of the original does not harm the verifiability of the article. Alternate sources about broad topics are usually easily located. A simple search engine query might locate an appropriate alternative, but be extremely careful to avoid citing mirrors and forks of Wikipedia itself, which would violate the verifiability policy.
Sometimes, finding an appropriate source is not possible, or would require more extensive research techniques, such as a visit to a library or the use of a subscription-based database. If that is the case, consider consulting other Wikipedia editors, for example at the Wikipedia:Village pump. Also, consider contacting experts or other interested editors at a relevant WikiProject.
Keeping dead links
A dead, unarchived source URL may still be useful. Such a link indicates that the information was (probably) verifiable in the past, and it might give another user with greater resources or expertise enough information to find the reference. It could also return from the dead. With a dead link, it is possible to determine whether it has been cited elsewhere, or to contact the person originally responsible for the source. For example, one could contact the Yale Computer Science department if http://www.cs.yale.edu/~EliYale/Defense-in-Depth-PhD-thesis.pdf[dead link] were dead. Place {{dead link}} directly after the dead URL and just before the </ref> tag if applicable, leaving the original link intact. Placing {{dead link}} auto-categorizes the article into the Articles with dead external links project category, and into a specific monthly date-range category based on the |date= parameter.
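The tagging step can be illustrated with a short Python sketch. It assumes the tag in question is the {{dead link}} template, that the |date= parameter takes a "Month Year" value, and that the dead URL sits inside an ordinary <ref>...</ref> pair; the function name is hypothetical. The original link is left intact and the tag is placed just before the closing </ref>.

```python
import re

# Matches <ref> or <ref name="..."> up to its closing tag.
# Self-closing <ref ... /> tags and <references> are deliberately not matched.
REF_RE = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL)


def tag_dead_link(wikitext, dead_url, month_year):
    """Append a dated dead-link tag inside every reference citing dead_url."""
    tag = "{{dead link|date=" + month_year + "}}"

    def mark(match):
        body = match.group(1)
        # Leave unrelated references, and already-tagged ones, untouched.
        if dead_url not in body or "{{dead link" in body.lower():
            return match.group(0)
        whole = match.group(0)
        return whole[: -len("</ref>")] + tag + "</ref>"

    return REF_RE.sub(mark, wikitext)
```

Running the function twice on the same text leaves it unchanged the second time, because references that already carry the tag are skipped.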
See also
- List of HTTP status codes
- Wikipedia:External links: prescribes removal of dead URLs from the "External links" section
- browser diversity—essay
- Wikipedia:Using the Wayback Machine—how-to guide
- Wikipedia:Using WebCite—how-to guide
- Wikipedia:WikiProject External links: dedicated to cleaning up overly long lists of external links and having articles conform to Wikipedia's external links guidelines
Bots
- WP:STiki/Dead_links: page reporting newly added dead links, a component of the STiki tool.
- browser diversity—(inactive since 2009) purpose is to update dead links caused by link rot. Submit any updatable links found (old + new locations) to the bot's talk page. After human verification, the bot automatically updates affected articles.
- web app: purpose is to change links in articles which are outdated and can be successfully replaced by a new one. Submit requests for link updates to the bot's talk page.
- User:WebCiteBOT: (inactive since 2009) purpose is to combat link rot by automatically archiving newly added URLs.
External links
- weblinkchecker.py: script from the Python Wikipedia Bot collection which finds broken external links.
- input transformation: an external link checker tool for Wikimedia Foundation projects, which lists dead links and allows recovery using archiving services.
- website parsing—allows you to search for a broken link's new address
- Resurrect Pages—add-on for Firefox, provides links to seven cache/archive websites upon coming across a dead link
- 404-Error?—add-on for Firefox, automatically brings you to the archive.org version upon coming across a dead link