Broken Link Checker for WordPress

Notice: This plugin has been transferred to ManageWP. I am no longer working on it. Please direct any feedback to the new developer. See the plugin homepage for more information.

Sometimes, links get broken. A page is deleted, a subdirectory forgotten, a site moved to a different domain. Most likely many of your blog posts contain links. It is almost inevitable that over time some of them will lead to a “404 Not Found” error page. Obviously you don’t want your readers to be annoyed by clicking a link that leads nowhere. You can check the links yourself but that might be quite a task if you have a lot of posts. You could use your webserver’s stats but that only works for local links.

So I’ve made a plugin for WordPress that will check your posts (and pages), looking for broken links, and let you know if any are found.

Features

  • Detects links that don’t work, missing images, deleted YouTube videos and other problems.
  • Periodically checks links in posts, pages, comments, custom fields and the blogroll.
  • New and modified entries are checked ASAP.
  • Notifies you on the Dashboard if any problems are found.
  • Lets you edit all instances of a specific link at once.
  • Gives you a list of all links ever posted on your site, with the ability to search and filter it.
  • Lets you apply custom CSS styles to broken and removed links.
  • Highly configurable.

The broken links show up in the Tools -> Broken Links tab. If any invalid URLs are found, a notification will also show up in the Dashboard widget. To save screen real estate, the widget can be configured to stay closed most of the time and automatically expand when broken links are detected.

Download

broken-link-checker.zip (412 KB)

    Requirements

    • WordPress 3.0 or later
    • MySQL 4.1 or later

    The current version of this plugin is only compatible with WordPress 3.0 and up. If you have an older version of WP, try one of the older releases. Specifically, version 0.8.1 is the last one that’s still compatible with the WP 2.8 branch, and version 0.4.14 is the last one compatible with WP 2.1 – 2.6.x.

    Installation

    Install “Broken Link Checker” just like any other WordPress plugin:

    1. Download the .zip file (see above).
    2. Unzip.
    3. Upload the broken-link-checker folder to your /wp-content/plugins directory.
    4. Activate the plugin in the Plugins tab.

    2,584 Responses to “Broken Link Checker for WordPress”

    1. White Shadow says:

      For referrer headers, it uses “home”.
      For resolving relative links, it uses either post permalinks (where applicable) or “siteurl”.

      I now realize this behaviour is inconsistent. For the next version, I’ll change it to use the “home” URL whenever a better option – such as a permalink – is not available.

    2. Odys says:

      Haven’t looked at plugin’s code yet. Does it run cron jobs to check for broken links?

      If so, I think I have a solution for this odd behavior.
      – Somehow, the cron ran from a different domain. Have to ask my host about it.
      – It tried to open an image but it failed. That’s because I disallow this in my .htaccess. So the “you are not allowed to hotlink my images” image loaded, and it thought that the images redirect.
      – For an unknown reason it tried to fix redirects.

      There are many “ifs”, though. A better explanation?

      Also, does it check images together with links?

    3. Lee Winter says:

      Some weirdness on repair of links to fragments. Many such repairs have altered the title property of the link by removing the ‘title=’ part of the property, but leaving the ‘”title text goes here”‘.

      Example:
      Before:
      <a href="…" title="title text goes here">anchor text</a>
      After:
      <a href="…" "title text goes here">anchor text</a>

      I did not notice the changes immediately because browsers tend not to complain. But html validators really object to attribute-less values.

      This may not be a BLC issue, but as far as I can tell BLC is the prime suspect.

      I will do some testing to narrow down a test case.

    4. White Shadow says:

      @ Odys:

      By default, it runs a cron job and uses periodic AJAX polling in the Dashboard to trigger the worker routine. You can enable/disable either in the “Advanced” tab in settings.

      I don’t know why it would spontaneously try to fix redirects, but that would certainly explain why the images were changed.

      And yes, it checks images together with links. They’re all just URLs as far as the checking routine is concerned.

    5. Odys says:

      Thanks White,
      For the time being, I’ve disabled image checking.
      If you find something, please let us know.

    6. White Shadow says:

      @ Lee Winter:

      I tried a few tests, but didn’t notice any problems – the “title” attribute was preserved. Waiting for your test case.

    7. PJ Tharsaile says:

      I’ve noticed that most of my borked links are YouTube links. I’ve managed to fix most of them. Most seem to be the result of deleted accounts (a few are a result of material being removed). When someone updates a link, it would be COOL to get a suggested alternative. It would entail keeping a record of substitutions, I know. But this is a feature I’d pay for if necessary.

    8. Odys says:

      Contacted my host and found that my assumptions were correct!
      CPANEL runs cron jobs on the main domain; addon domains cannot run their crons themselves.

      We DO know now that,
      – if somebody has CPANEL to manage their hosting service (most likely),
      – and if they have a blog on an addon domain (most likely),
      – and if they do not allow image hotlinking (most likely)

      they might see their post structure ruined.
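
The chain of events Odys describes can be modeled in a few lines. This is only a sketch of his hypothesis, not the plugin's code or a real .htaccess rule: the function name `hotlink_response`, the `addon.example` and `main.example` hosts, and the placeholder image path are all made up for illustration, and the rule is assumed to let requests with no Referer through.

```python
from urllib.parse import urlsplit

def hotlink_response(image_url, referer, placeholder_url):
    """Mimic a hotlink rule: redirect image requests whose Referer is another host.

    Returns a (status_code, served_url) pair. Hypothetical model only.
    """
    site_host = urlsplit(image_url).hostname
    referer_host = urlsplit(referer).hostname if referer else None
    # Assumption: an empty Referer, or one from the same host, is allowed through.
    if referer_host is None or referer_host == site_host:
        return 200, image_url
    # Any other host is redirected to the "no hotlinking" placeholder image.
    return 302, placeholder_url

image = "http://addon.example/img/photo.png"
placeholder = "http://addon.example/nohotlink.png"

# A check that identifies itself with the blog's own address is served the image.
print(hotlink_response(image, "http://addon.example/", placeholder))
# A check that appears to come from the main domain is redirected instead.
print(hotlink_response(image, "http://main.example/", placeholder))
```

Under these assumptions, a checker that sees the second response would record the image as a redirect even though the image itself is fine.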

    9. Lee Winter says:

      I’m having some trouble with my test case. The site is about 100 static pages and one test post. Prior to the test there were 702 unique links out of ~1550 total. BLC was complaining about two broken internal links (missing pages) and six external redirections (all 302s, so not something I intended to fix).

      The test was designed as follows: the “people” page defined about 20 URI fragments, one for each person referenced on the site. I broke all of those links by changing the fragment ids. But the test is not effective because I cannot get BLC to reprocess the site. I cleared the cache, hit BLC/Advanced/Nuclear-option, but saw no activity. So I deactivated the plugin, poked around a bit, and re-activated it. No action. So I deleted the BLC plugin, ran W3 link checker to verify the existence of a bunch of broken fragment links, and reinstalled the BLC plugin.

      The configuration is set to 1800 second batch duration, and to run both hourly and as long as admin is open. The system load average has ranged from 0.73 to 1.66 with the limit set at 2.5. So it should be running full tilt.

      But there is no indicator of activity or progress. *It might be useful to fix that gap*.

      Note that W3 link checker handles the entire site in about 6 min using 1 sec gaps between page fetches. So it probably should not take hours to complete a pass just to find the links.

    10. Matthias says:

      Does BLC support the new YouTube/Vimeo embed code style () ???

    11. […] Broken Link Checker – A small plugin that checks whether there are any broken links. […]

    12. Hi Janis,
      got a real subtle bug with the code. On my site I’ve added a few relative links

      e.g. href=”/about/”

      I’m also running my server on an alternative port (it’s a virtual server for testing).

      http://lampvm.monster:8080

      Link checker is testing

      http://lampvm.monster/about/

      when it should be testing

      http://lampvm.monster:8080/about/

      so I’m getting a few false positives.

      Otherwise it’s a great plugin, thanks
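
For comparison, here is a minimal sketch of resolving a relative link against a base URL that carries a port, using Python's standard `urllib.parse.urljoin`. It is not the plugin's own code (which is not shown in this thread), and the helper name `resolve_link` and the sample post permalink are assumptions for illustration; the point is only that the port is part of the base and should survive resolution.

```python
from urllib.parse import urljoin, urlsplit

def resolve_link(base_url: str, href: str) -> str:
    """Resolve href against base_url, keeping the base's scheme, host, and port."""
    return urljoin(base_url, href)

# The site from the report above, served on a non-default port.
base = "http://lampvm.monster:8080/"
resolved = resolve_link(base, "/about/")
print(resolved)                 # http://lampvm.monster:8080/about/
print(urlsplit(resolved).port)  # 8080

# A link relative to a (hypothetical) post permalink resolves the same way.
print(resolve_link("http://lampvm.monster:8080/2010/05/post/", "../other/"))
```

If the base URL is rebuilt from the host name alone, the `:8080` part is lost, which would produce the false positives described above.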

    13. Lee Winter says:

      Apparently this did not get through to you from the w-shadow email:
      ——————————–
      As previously described my test case is pretty simple. But I made the
      mistake of upgrading to 1.2.4 before running the tests. The results
      are disgusting.

      BLC is finding a few pages that I have left seriously broken — either
      missing or timeout.

      The test did not detect _any_ broken links to page fragments. I
      re-ran it several times, letting it simmer overnight twice. I’ve
      cleared the database lots of times, deactivated and reactivated
      several times, and reinstalled 1.2.4 once. No change.

      I have been comparing BLC results against
      http://validator.w3.org/checklink/, which does see the broken
      fragments.

      So now I have given up on 1.2.4 and am reinstalling 1.2.3.

      Do you have a test suite of broken links that you run new versions
      over in order to verify that maintenance/enhancements have not
      introduced undesirable behavioral changes? I would be willing to help
      set that up if you are able to automate the application of the BLC to
      the test pages and also automate the comparison of results. The test
      harness would need a batch-mode framework rather than the plugin’s
      continuous framework.

      The site in question is tactile-revolution.org. The broken links are
      in /ask/when/history/. They should be pointing to parts of
      /ask/what/people/.

      I’ll send more info when I have it.

      Lee Winter
      NP Engineering
      Nashua, New Hampshire
      United States of America (NDY)

    14. Lee Winter says:

      Apparently this also did not get through to you:
      ——————————–
      Follow up on link editing (was missing titles).

      Using version 1.2.3 has deepened the mystery.

      Objective fact #1: there are broken fragments on
      tactile-revolution.org/ask/when/history/ to parts of
      tactile-revolution.org/ask/who/people/. We know this to be fact for
      two reasons: (a) I damaged the links by changing the definitions of
      the target fragments, and (b) applying
      http://validator.w3.org/checklink/ to the first page mentioned above
      produces these results (among others):

      Objective fact #2: BLC does not detect the above-described broken
      links. See attached screen shot.

      Since I was able to obtain reports of fragments with version 1.2.3
      last week something is interfering. I could not find a likely culprit
      among the settings. A summary of the settings in use is also
      attached.

      I will try disabling plugins to see if they have any effect. Can you
      think of any likely culprits?

      Lee Winter
      Nashua, New Hampshire
      United States of America (NDY)

      (attachments not included)

    15. Lee Winter says:

      Present tense:

      It appears that 1.2.3 -> 1.2.4 is not part of the problem. Both versions consistently do not find broken fragment links that the W3 link checker does find.

      This is a more important issue than the vanishing title= attribute.

      I may have done something to cause this behavior, but I have reproduced it with no plugins active at all and then a fresh install of BLC. So where else should I look for something that might be interfering with BLC’s scans?

      The page containing the fragments is /ask/who/people/.
      The page containing the broken links is /ask/when/history/.

      The only other odd thing I’ve noticed is that W3 link checker skips (zero elapsed time, zero links) a couple of large pages (both over 50KB). I broke one of the big pages up into about 20 smaller pages, but w3 link checker still skips it. So there may be something strange that is interfering with crawling the site.

      The big page that is still big is /ask/what/glossary/dimensions.
      The big page that I broke into smaller pieces is /ask/what/glossary/braille-terms/.

      Is there a way to enable a more verbose behavior of BLC?

    16. White Shadow says:

      I’m a bit ill right now (probably the flu) and not in any shape to do debugging, but I will note this: BLC does not check for broken fragment links. At all. If a link that contains an invalid #fragment shows up as broken, it is definitely for some other reason.

    17. Lee Winter says:

      OK, that explains the behavior I am seeing. Thanks for the info.

      Have you considered adding checks for fragment links? The W3 link checker source is available.

    18. White Shadow says:

      If memory serves, you’re the first person to suggest it. I’ll add it to my list of possible improvements.
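
As a rough illustration of what a fragment check involves, the sketch below parses a target page and looks for an element whose `id` (or an `<a>` whose `name`) matches the link's fragment. It is not how BLC or the W3 link checker is implemented; the names `AnchorCollector` and `fragment_exists` and the sample markup are invented for this example, and percent-encoded fragments are not handled.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

class AnchorCollector(HTMLParser):
    """Collect every attribute value that can serve as a fragment target."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Any element's id, or the legacy <a name="...">, defines a target.
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.targets.add(value)

def fragment_exists(url: str, target_html: str) -> bool:
    """Return True if url has no fragment, or its fragment is defined in target_html."""
    _, fragment = urldefrag(url)
    if not fragment:
        return True
    collector = AnchorCollector()
    collector.feed(target_html)
    return fragment in collector.targets

# Made-up markup standing in for the fetched target page.
page = '<h2 id="person-a">Person A</h2><a name="person-b"></a>'
print(fragment_exists("/ask/who/people/#person-a", page))  # True
print(fragment_exists("/ask/who/people/#renamed", page))   # False
```

A checker that only looks at the HTTP status of the target page would report both links as working, which matches the behavior described above.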

    19. gekido says:

      I’m getting a dramatic failure trying to activate the plugin on a wordpress 3 multisite installation. This is one of my ‘must use’ plugins for any wordpress install so this is a sad day ;{

      Any chance of an update that is ‘network aware’?

      Thanx for the great plugin – keep up the good work!
