Broken Link Checker for WordPress
Sometimes, links get broken. A page is deleted, a subdirectory forgotten, a site moved to a different domain. Most likely many of your blog posts contain links. It is almost inevitable that over time some of them will lead to a “404 Not Found” error page. Obviously you don’t want your readers to be annoyed by clicking a link that leads nowhere. You can check the links yourself but that might be quite a task if you have a lot of posts. You could use your webserver’s stats but that only works for local links.
So I’ve made a plugin for WordPress that will check your posts (and pages), looking for broken links, and let you know if any are found.
Download it now! (423 KB)
Features
- Detects links that don’t work and missing images.
- Periodically checks links in posts, pages, comments and the blogroll.
- New and modified entries are checked ASAP.
- Notifies you on the Dashboard if any problems are found.
- Lets you edit all instances of a specific link at once.
- Gives you a list of all links ever posted on your site, with the ability to search and filter it.
- Lets you apply custom CSS styles to broken and removed links.
- Highly configurable.
The broken links show up in the Tools -> Broken Links tab. If any invalid URLs are found, a notification will also show up in the Dashboard widget. To save screen real estate, the widget can be configured to stay closed most of the time and automatically expand when broken links are detected.
Installation
Install “Broken Link Checker” just like any other WordPress plugin:
- Download the .zip file.
- Unzip.
- Upload the broken-link-checker folder to your /wp-content/plugins directory.
- Activate the plugin in the Plugins tab.
Download
broken-link-checker.zip (423 KB)
Requirements
- WordPress 3.0 or later
- MySQL 4.1 or later
The current version of this plugin is only compatible with WordPress 3.0 and up. If you have an older version of WP, try one of the older releases. Specifically, version 0.8.1 is the last one that’s still compatible with the WP 2.8 branch, and version 0.4.14 is the last one compatible with WP 2.1 – 2.6.x.
@White Shadow – Sure, it’s Italian
However, if I deactivate the plugin, everything works right again… See you
[...] for almost 6 years and have not made many changes to it recently. I had, however, installed the Broken Link Checker plugin and thought that was the culprit. Anyway, I told my host that I would remove the problematic [...]
[...] Broken Link Checker – This plugin will monitor your blog and looks for any broken links and let you know if any broken links are found. [...]
[...] Broken Link Checker scans all your articles and their internal links to verify that none of them are broken. I haven’t kept this plugin continuously active, though did check my entire site after upgrading and changing many of my categories, and, happily, there were no broken links. [...]
[...] the WordPress Broken Link Plug In, I found that along with the hundreds of links to my old web site I had over 160 broken links on my [...]
BLOATWARE ALERT… caused my blog to take FOREVER to load… probably because I have LOTS of content… but still! This is hardly “running invisibly in the background”!!!
@Bleh – Depends on your browser… I’ve had no problems with it
This is a great plugin! Except that it’s reporting links to nearly all Wikipedia pages as broken, while they are not. Such as:
http://en.wikipedia.org/wiki/Pandora
http://ja.wikipedia.org/wiki/%E7%A5%9E%E3%81%AE%E9%9B%AB
Is there something funny about these Wikipedia pages? Or is it just that the Wikipedia web sites can’t be reached from my host?
@hillman – Wikipedia links work okay (i.e. not detected as “broken”) on my blogs, but some other users seem to have problems with them. Maybe it’s really a problem with your server, or maybe Wiki is blocking certain ranges of IP addresses.
@White Shadow – I see. Perhaps I’ll just discard those errors. Thanks for the reply!
Does Broken Link Checker keep a log of the checking result? Like, whether it reports the link as broken coz it gets a 404 or 503 when checking?
Also, one of the links in my blog points to:
http://en.wikipedia.org/wiki/Pandora's_box
but BLC is reporting the link as:
http://en.wikipedia.org/wiki/Pandora
Does it stumble on single quote in the URL?
Xenu Link Sleuth produces some sort of error message on Wikipedia links (“forbidden”), I am guessing the same thing that stops XLS also blocks the BLC4WP.
@hillman – There is no log.
The plugin can probably get confused by links that contain a quote. I’ll need a more advanced link-finding regexp to solve that, but it’s doable.
The official regex for URIs is
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
Confused yet?
Nah, it makes perfect sense
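For reference, the expression quoted above matches the one given in Appendix B of RFC 3986 for splitting a URI reference into its components. Below is a minimal Python sketch of what the numbered groups capture. The helper name `split_uri` is made up for illustration, and the snippet is not part of the plugin.

```python
import re

# The pattern from RFC 3986, Appendix B, copied as quoted in the comment above.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five components (hypothetical helper).

    Every part of the pattern is optional, so it matches any string;
    components that are absent come back as None.
    """
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("http://en.wikipedia.org/wiki/Pandora's_box")
print(parts)
```

Note that this pattern only stops at `:`, `/`, `?` and `#`, so the apostrophe in the Wikipedia URL stays in the path. If a link gets truncated at a quote, the likely culprit is the step that extracts the `href` value from the HTML, not URI parsing.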
Actually I need a regexp for HTML links. I think I’ll need to update the existing one to use a backreference to ensure it only accepts matching types of quotes for the opening and closing quote of the href parameter (I think that sentence got a bit tangled).
Anyway, I’m putting that off, because currently I have a more pressing issue – finding out how somebody managed to hack this site today. I’ve got it back working, but now I need to read the logs and whatnot…
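The backreference idea described above can be sketched roughly as follows. This is an illustration in Python, not the plugin's actual PHP pattern: `NAIVE_RE`, `HREF_RE` and `find_links` are hypothetical names, and the "naive" pattern is only an assumption about the kind of expression that would truncate a URL at an apostrophe. A regex like this is also not a full HTML parser (it ignores unquoted attributes, for example).

```python
import re

# Assumed "naive" pattern: accepts either quote type as the closing quote,
# so the value is cut off at the first ' or " it sees.
NAIVE_RE = re.compile(r"""href\s*=\s*["']([^"']*)["']""", re.IGNORECASE)

# Backreference version: group 1 remembers the opening quote and \1 requires
# the same character as the closing quote. Group 2 is the URL.
HREF_RE = re.compile(
    r"""<a\s[^>]*?href\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def find_links(html):
    """Return the href values of <a> tags found in an HTML fragment."""
    return [m.group(2) for m in HREF_RE.finditer(html)]


sample = '<a href="http://en.wikipedia.org/wiki/Pandora\'s_box">box</a>'
print(NAIVE_RE.findall(sample))  # value is truncated at the apostrophe
print(find_links(sample))        # full URL is preserved
```

With the sample link, the naive pattern stops at the apostrophe and yields the shortened `.../wiki/Pandora` address, which is the symptom reported earlier in the thread, while the backreference version keeps reading until it finds the matching double quote.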
[...] at the time for visitors to view them) 15-Auto Social Poster: like the previous one, but with a few different sites 16-Broken Link Checker: searches for broken links in the blog and alerts you to fix them 17-Global Translator: a hack [...]
[...] I only recently installed the Broken Link Checker plugin. It runs in the background and scans old posts for [...]
OK, more on my broken link checker stopping in the queue. It appears to get hung up when checking a link to http://www.phl.org/ . The site never loads up, the curl call never seems to actually time out as it should, and so the AJAX call comes back with a 504 (Gateway Timeout) error. When I manually removed this link from the blc_linkdata table, it went happily on checking the rest of the queue.
@Michael Hampton – Hmm, I guess I’ll have to think of something to detect cases like that.
Well, curl should be timing out, but it isn’t. Looks like the CURLOPT_TIMEOUT isn’t being honored. I don’t know if that’s a bug in curl or in PHP, but it’s certainly a bug.
Anyway, I worked around it by just having the link checker pull links from the work queue randomly, i.e.:
/* check the queue and process any links unchecked */
$sql="SELECT * FROM $linkdata_name WHERE ".
" ((last_check<'$check_treshold') OR ".
" (broken=1 AND check_count<5 AND last_check<'$recheck_treshold')) ".
" ORDER BY RAND() LIMIT 100";
Now it’s at least checking most of them.
@Michael Hampton – I seem to recall that “ORDER BY RAND()” is considered a bad thing performance-wise.
Sure, it’s a performance problem if you’re randomizing a few million records. Randomizing 100 is trivial.
P.S. Please fix your comment form. I’m going nuts having to re-type my information in all over again every time.
@Michael Hampton – I’ll use a different workaround. The new version should be up soon.
I think one of my antispam plugins is causing the comment problem. I’ll investigate, but it might take a while.
really works thanks
Hello, nice to meet you, Janis.
This is a nice plugin.
I fixed a lot of broken links on my blog.
But the first time, I got the error below in the “Status:” area of the admin page:
Fatal error: Cannot redeclare class wpdb in /var/www/wp-includes/wp-db.php on line 53
So I removed the line
require_once("../../../wp-includes/wp-db.php");
from wsblc_ajax.php, though I thought it was a poor solution.
Today, I found an article, “Tackle Plugin Compatibility Issues While Using Popular Libraries” : http://weblogtoolscollection.com/archives/2008/08/27/tackle-plugin-compatibility-issues-while-using-popular-libraries/
Yes, that might be the answer!!
I changed it as below:
/*
The AJAX-y part of the link checker.
*/
require_once("../../../wp-config.php");
if(!class_exists('wpdb')) {
require_once("../../../wp-includes/wp-db.php");
}
…
I GOT IT!!!
Maybe require_once("../../../wp-config.php") potentially has the same problem.
I hope my work helps you, thank you.
@Dai – While I haven’t encountered this problem before, your suggested modification certainly can’t hurt. I’ll add it to the plugin.
However, I think using “isset($wpdb)” instead of “class_exists('wpdb')” will be a better choice, because the comments in wp-db.php indicate that it is possible to replace this class with something else by setting the global variable $wpdb.
Including wp-config.php should be okay, that just loads the WordPress core.
[...] solution: Broken Link Checker for WordPress will check and detect both internal and outbound links that don’t work and notifies you on the [...]
Thank you for the update.
> using “isset($wpdb)” instead of “class_exists('wpdb')” will be a better choice
I guess you’re right; it actually works fine.
'wp-db.php' is not one of the “Popular Libraries” like PclZip, but a part of the WordPress system.
The article may not be suitable for this situation.
Anyway, the problem is gone.
I won’t have to edit the file when this plugin is updated.
If you need, I’ll try to find out which plugin causes the conflict I described.
@Dai – Nah, it’s fine, though feel free to investigate if that helps your peace of mind.
This is a great plugin – found loads of dead links on one of my blogs. But on another, I can’t get it to run.
Both blogs are 2.6.1, and both are hosted on the same server, so I think there must be some interference with another plugin, one that I’m not using with the first, successful installation. Specifically, what happens is that it doesn’t run automatically, and when I click ‘re-check all pages’ in an attempt to trigger it, I get my ‘no results of a search’ page, suggesting other links, up for a few seconds. Then the default Broken Link Checker start page comes up again. Very odd!
I’ve tried deactivating all the obvious plugins – do you have any suggestions?
@Lucy – Hmm, here are some ideas:
* When you click the “re-check all” button note the URL of the “search page” (before it redirects). That might give some clues.
* Check your .htaccess. Maybe there are some security-related rules that are blocking parts of the plugin.
* Try disabling/enabling any cache-related plugins. My intuition tells me those could have something to do with it (somehow).
* Check file permissions/file owner of broken-link-checker.php and wsblc_ajax.php (especially the second one). The “right” permissions depend on your server configuration, but in general these files should both have the same permissions as the .php files of other, working WP plugins on your server.