Broken Link Checker for WordPress
Sometimes, links get broken. A page is deleted, a subdirectory forgotten, a site moved to a different domain. Most likely many of your blog posts contain links. It is almost inevitable that over time some of them will lead to a “404 Not Found” error page. Obviously you don’t want your readers to be annoyed by clicking a link that leads nowhere. You can check the links yourself but that might be quite a task if you have a lot of posts. You could use your webserver’s stats but that only works for local links.
So I’ve made a plugin for WordPress that will check your posts (and pages), looking for broken links, and let you know if any are found.
Download it now! (40 KB)
Note : This page and the feature list below are slightly out of date, as a major update has been released recently (see details). I’ll get around to updating this page eventually.
Features
- Checks your posts (and pages) in the background (whenever the WP admin panel is open).
- Detects links that don’t work and missing images. Checks both internal and outbound links.
- Notifies you on the Dashboard if any problems are found.
- Link checking intervals can be configured.
- New/modified posts are checked ASAP.
The broken links show up in the Manage -> Broken Links tab. If any invalid URLs are found a notification will also show up in the sidebar on the Dashboard.
The Broken Links tab displays a list of invalid URLs found along with the relevant posts and the anchor text of the links. “View” and “Edit Post” do exactly what they say and “Discard” will remove the message about a broken link, but not the link itself (so it will show up again later unless you fix it; this plugin doesn’t modify your links).
By default all old posts/links are re-checked every 72 hours, or you can set a different time period.
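The re-check rule can be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's actual PHP code; the function name is_due_for_recheck and its parameters are hypothetical, and only the 72-hour default comes from the description above.

```python
from datetime import datetime, timedelta

def is_due_for_recheck(last_checked, now, interval_hours=72):
    """Return True if a link should be checked again.

    last_checked is None for links that have never been checked
    (for example in a new or modified post), so those are always due.
    """
    if last_checked is None:
        return True
    return now - last_checked >= timedelta(hours=interval_hours)

now = datetime(2009, 10, 23, 12, 0)
print(is_due_for_recheck(datetime(2009, 10, 20, 12, 0), now))  # True (exactly 72 hours)
print(is_due_for_recheck(datetime(2009, 10, 22, 12, 0), now))  # False (only 24 hours)
```

Passing a different interval_hours value corresponds to setting a different time period in the plugin's options.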
Notes (Semi-Technical)
I realize there are a lot of features that could be added to improve this plugin considerably. However, this release is intended to “test the waters” and see if there’s demand for a plugin like this, so I only implemented the most basic functions. The plugin has been upgraded to be slightly beyond “basic”.
I thought about using WP’s pseudo-cron to run the link checker by schedule and decided against it. AFAIK the cronjobs execute when a page is requested; since this plugin does some lengthy processing it may increase page load times unacceptably when used in this manner. That’s why I set it to run the checks asynchronously (AJAX) and invisibly in the admin panel.
Installation
Just like any other WordPress plugin -
- Download (see below).
- Unzip.
- Upload the broken-link-checker folder to your wp-content/plugins directory.
- Activate the plugin in the Plugins tab.
Upgrading
- Deactivate the plugin (important!).
- Do steps 1-3 from “Installation” (download, unzip, and upload the broken-link-checker folder to your wp-content/plugins directory).
- Re-activate the plugin in the Plugins tab.
Download
Version 0.5.3 : broken-link-checker.zip (40 KB)
Requirements
- WordPress 2.7 or later
- MySQL 4.1 or later
Starting with version 0.5 this plugin is only compatible with WordPress 2.7 and up. Older versions (e.g. ver. 0.4.14) should work with WP 2.1 – 2.6.x.
[...] Broken link checker: Checks the outbound links on our blog and verifies whether they are still active. Very useful for old articles. [...]
Every time I update Broken Link Checker, I have the same 15 broken links appear. All of them contain “2009/08/category/” in the URL. I remove this from the link and everything is fine until the next upgrade. Is there something within the plugin that creates this problem? Or is it being created by some other plugin?
Hi there!
I have a problem with your plugin… every time I update it, I have to change one line (line 2218 in this update) to $path = ini_get('upload_tmp_dir');
It is the open_basedir restriction, so please do me a favor and change it.
Have a nice day,
Asmodiel
@ Keyword-SEO-guy : I’m guessing those links use relative URLs, right? This is a bug in the plugin; I’m uploading a fixed version to wordpress.org as I write this.
@ Asmodiel : Alright, though I’ll do it a bit differently.
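To show why relative URLs can produce addresses like the “2009/08/category/” ones reported above, here is a small Python sketch. It is not the plugin's code, and whether it matches the actual bug is an assumption; the example.com URLs and the resolve_link helper are hypothetical. The point is only that the same relative link resolves differently depending on which page is used as the base.

```python
from urllib.parse import urljoin

def resolve_link(base_url, href):
    """Turn the href found in a post into an absolute URL to check."""
    return urljoin(base_url, href)

# Hypothetical URLs, for illustration only.
site_root = "http://example.com/"
archive_page = "http://example.com/2009/08/"

print(resolve_link(site_root, "category/news/"))
# http://example.com/category/news/
print(resolve_link(archive_page, "category/news/"))
# http://example.com/2009/08/category/news/
print(resolve_link(archive_page, "/category/news/"))
# http://example.com/category/news/
```

A link checker that resolves a relative href against the wrong base would report a valid link as broken, which fits the symptom described.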
re: 809
My ISP upgraded the CURL library. I checked the plugin’s debug info:
PHP version 5.2.5
MySQL version 5.0.41-community
CURL version 7.19.5
Snoopy Installed
Safe mode Off
open_basedir On
Redirects may be detected as broken links when open_basedir is on.
Lockfile /tmp/wp_blc_lock
However, I see that open_basedir shows as on, but Support has denied that it’s actually turned on. They referred me to http://jade.rahul.net/~ldhesi2/info.php which lists config info for the server, in which I see that open_basedir has “no value”.
So the CURL issue is dealt with, but I’m not sure about the open_basedir situation, and meanwhile, I rescanned everything and there are no redirects showing.
Thanks for your ongoing help with this!!!
Regarding open_basedir, the plugin just reports what PHP tells it. I’ll make it display the value it sees for the open_basedir setting, maybe that will provide some clues.
I noticed that your broken-link-checker is not internationalized. Please consider doing so; it’s not much work and might widen your range of fans. I made a start by adding a domain name to your __('textstrings-to-be-localized') calls and the one _e('textstrings-to-be-localized') call you have in your broken-link-checker.php file. Not all English terms are translated by this, though. You might want to look at the http://codex.wordpress.org/I18n_for_WordPress_Developers/ page.
I uploaded the edited broken-link-checker.php, together with the broken-link-checker-nl_NL.po and mo file in a zip file to:
http://rapidshare.com/files/294540358/broken-link-checker.zip
HI,
I agree with dingoe, as I told you some weeks ago… I can also add an Italian po file.
byez
G.
re: 817
Here’s what the debug info says now (after your update)
PHP version 5.2.5
MySQL version 5.0.41-community
CURL version 7.19.5
Snoopy Installed
Safe mode Off
open_basedir On ( /home/portigal/:/tmp:/usr/local/lib/php/ )
Redirects may be detected as broken links when open_basedir is on.
Lockfile /tmp/wp_blc_lock
Interesting (to my naive self) that the open_basedir is giving a different readout here than http://jade.rahul.net/~ldhesi2/info.php which says “no value”
Interesting, yes.
I was able to check /usr/local/lib/php.ini
[relevant lines, via grep, but I can get more of the file if need be]
; open_basedir, if set, limits all file operations to the defined directory
;open_basedir =
So there’s no value set there. But I suppose something else could be changing the setting – another plugin, something in WP, etc. I don’t really know what it is or if that’s even possible, but I wonder if that’s a source for investigating?
According to PHP docs the open_basedir directive can only be set in php.ini and httpd.conf in your version of PHP. You’d need at least PHP 5.3.0 to set it from a script, so I’m fairly certain plugins/WP can’t change it in your case (PHP 5.2.5).
Maybe open_basedir isn’t the problem here. You could try (re-)enabling redirect detection manually : comment out lines 161 and 163 in link-classes.php like this :
I tried deactivating and reactivating the plugin, but the error about the table is still there.
Did you get an error when reactivating (you should have if it couldn’t create the table)? If yes, what was it?
For my own WordPress blog I have rewritten the broken-link-checker.php file and made my own broken-link-checker-nl_NL.mo file. I will post this rewritten file and the broken-link-checker.pot file in a zipped file on Rapidshare for those who want a localized plugin.
http://rapidshare.com/files/295841715/broken-link-checker.zip
[...] Broken link checker – Helps my users and is good for SEO by identifying broken links within my sites. From the dashboard you can unlink or edit the link to something that will work. [...]
Hi there!
The problem with open_basedir, as I’ve written above, is not really fixed by the last update. It works when using the proposed ini_get('upload_tmp_dir'); … Maybe you should really consider changing it.
I could write a German and Polish localisation….
Interesting. As a test, could you try switching the order the plugin tries the directories like this (around line 2220) :
The reason I’m reluctant to implement your suggestion exactly as written is because it doesn’t account for other users’ situation. (What if someone doesn’t have an upload dir. set? What if it gets automatically cleared at inopportune moments? Etc.)
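The idea of trying several directories in order, instead of relying on a single setting, can be sketched like this. It is a minimal Python illustration, not the plugin's PHP; the pick_lock_dir helper and the candidate order are hypothetical, and Python has no equivalent of open_basedir, so only the fallback logic is shown.

```python
import os
import tempfile

def pick_lock_dir(candidates):
    """Return the first candidate that exists and is writable, or None."""
    for path in candidates:
        # Skip unset/empty settings, missing directories and read-only ones.
        if path and os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None

# Hypothetical candidate order: a configured upload temp dir (which may be
# unset, represented here by an empty string), then the system temp dir.
configured_upload_tmp = ""
print(pick_lock_dir([configured_upload_tmp, tempfile.gettempdir()]))
```

With a fallback list like this, an unset upload directory does not break anything; the next candidate is simply used instead.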
So I see there’s great demand for localisation(s). Okay, I’ll get started on that sometime next week.
I installed this plugin but it immediately paralyzed both backend and frontend. I’d better use Xenu’s Link Sleuth for now.
re: 823
(more naive questions, sorry) Where do I find link-classes.php? It’s not part of the WP install as far as I can tell…
It’s one of the plugin’s files, in wp-content/plugins/broken-link-checker/.
Well, I tried that. I rechecked all links and then I created a post with http://chittahchattah.blogspot.com which redirects to http://www.portigal.com/blog
1. Post published on : October 23, 2009
2. Link last checked : October 23, 2009
3. HTTP code : 200
4. Response time : 0.215 seconds
5. Final URL : http://chittahchattah.blogspot.com
6. Redirect count : 0
7. Instance count : 1
1. Log : === First try : 200 ===
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Expires: Sat, 24 Oct 2009 05:26:05 GMT
Date: Sat, 24 Oct 2009 05:26:05 GMT
Cache-Control: public, max-age=0, must-revalidate, proxy-revalidate
Last-Modified: Mon, 13 Jul 2009 22:44:07 GMT
ETag: "5bd197ab-8424-4f4a-aacb-ebb0252da666"
X-Content-Type-Options: nosniff
X-XSS-Protection: 0
Content-Length: 0
Server: GFE/2.0
Link is valid.
Debug info:
PHP version 5.2.5
MySQL version 5.0.41-community
CURL version 7.19.5
Snoopy Installed
Safe mode Off
open_basedir On ( /home/portigal/:/tmp:/usr/local/lib/php/ )
Redirects may be detected as broken links when open_basedir is on.
Lockfile /tmp/wp_blc_lock
In other words, it didn’t make any difference. No redirects out of my 7000+ links.
[...] – Broken Link Checker will regularly scan your pages and let you know, on the admin dashboard, [...]
http://chittahchattah.blogspot.com uses a META redirect, which is not directly supported by CURL and similar libraries.
But that still doesn’t explain why other redirects wouldn’t show up, and I’m running out of ideas. Khm. Okay, here’s one more test – download this archive, unzip it, upload the test_redirect.php file to your server and run it. It should output a bunch of information; paste it here.
The script is basically a greatly simplified variant of the link checking code that tries to load http://w-shadow.com/redirect302.php – a page I created specifically to test redirects – and displays various information about the result(s).
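On the META redirect mentioned above: because it lives in the page's HTML rather than in the HTTP headers, detecting it means parsing the response body. A rough Python sketch of that idea follows. It is not something the plugin is stated to do; the class and function names are hypothetical and the sample markup uses example.com.

```python
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Collects the target URL of a <meta http-equiv="refresh"> tag, if any."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        content = attr.get("content") or ""
        _, _, rest = content.partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value.strip():
            self.target = value.strip().strip("'\"")

def find_meta_redirect(html):
    """Return the META refresh target found in html, or None."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target

sample = '<html><head><meta http-equiv="refresh" content="0; url=http://example.com/new"></head></html>'
print(find_meta_redirect(sample))  # http://example.com/new
```

A page like this still answers with HTTP 200, which is why a header-only check reports it as a valid link with a redirect count of 0.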
[...] Broken Link Checker page [...]
Thanks – Note that I still have the commented-out version of link-classes.php
Fetching http://w-shadow.com/redirect302.php
Warning: curl_setopt() [function.curl-setopt]: CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set in /home/portigal/domains/portigal.com/public_html/test_redirect.php on line 10
HTTP Header(s)
HTTP/1.1 302 Found
Date: Sat, 24 Oct 2009 16:58:40 GMT
Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.7a mod_bwlimited/1.4 PHP/5.2.10
X-Powered-By: PHP/5.2.10
Location: http://w-shadow.com/
Vary: User-Agent,Accept-Encoding
Content-Type: text/html; charset=UTF-8
Sorry for the multiple comments – the output is in a format that your comment feature doesn’t like, giving me a blank screen upon submit, so I’m snipping out various pieces to see how I can get it to load – I’ll reformat it in Word and see if that helps…
Response info
Array
url => http://w-shadow.com/redirect302.php
content_type => text/html; charset=UTF-8
http_code => 302
header_size => 280
request_size => 67
filetime => -1
ssl_verify_result => 0
redirect_count => 0
total_time => 4.325351
namelookup_time => 4.165701
connect_time => 4.244815
pretransfer_time => 4.244826
size_upload => 0
size_download => 0
speed_download => 0
speed_upload => 0
download_content_length => -1
upload_content_length => -1
starttransfer_time => 4.325252
redirect_time => 0
CURL version info
Array
version_number => 463621
age => 3
features => 1565
ssl_version_number => 0
version => 7.19.5
host => x86_64-unknown-linux-gnu
ssl_version => OpenSSL/0.9.8b
libz_version => 1.2.3
protocols => Array
0 => tftp
1 => ftp
2 => telnet
3 => dict
4 => ldap
5 => http
6 => https
7 => ftps
Right. That warning there – “CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir” – is a clear indication that open_basedir is enabled, and thus redirects don’t work. I was uncertain before, but not any more – this particular error message is generated by PHP itself when a script tries to do something that’s not allowed by the server config; it can’t be a bug in the script*.
I’ll add a workaround for situations like this. Redirects will show up in the report, but you’ll have to check each one manually to know where it redirects to (because the plugin still won’t be able to follow the redirect).
* That sounded ambiguous. What I mean is the error message shows that the server config really prevents the plugin from following redirects, as opposed to the plugin just incorrectly thinking it can’t follow them and giving up.
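The workaround described above, reporting a redirect without following it, could look roughly like this. It is a simplified Python sketch, not the plugin's actual implementation; classify_response and its three categories are hypothetical, and treating every other status as “broken” is a deliberate simplification.

```python
def classify_response(status_code, headers):
    """Classify one HTTP response without following redirects.

    headers is a dict of response headers. Returns a (kind, target) tuple,
    where target is the Location value for redirects and None otherwise.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if 300 <= status_code < 400 and "location" in lowered:
        return ("redirect", lowered["location"])
    if 200 <= status_code < 300:
        return ("ok", None)
    return ("broken", None)

# Status and Location header taken from the test_redirect.php output quoted earlier.
print(classify_response(302, {"Location": "http://w-shadow.com/"}))
# ('redirect', 'http://w-shadow.com/')
```

The Location header only gives the first hop, so the final destination still has to be checked by hand, as noted above.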
Thank you for this unbelievably useful plugin. I’ve used it almost since it was first released and find it really wonderfully useful on my blogs for isolating sites I’ve linked to that have since died or moved. The growth in the plugin during the last 2 years has been spectacular too!
I’m not sure if this is possible, or if it’s just me, etc., but the one thing that really bugs me is that when I’m editing a URL on the broken links page that I can’t just press ‘enter’ to save the changed link – I have to manually click ‘save link’. I am forever forgetting to do that because it seems counterintuitive that ‘enter’ doesn’t work.
The only other tinsy tiny problem I have is that I’d like more documentation – I’m never quite able to remember the differences/effects of “unlink”, “discard”, and “exclude”. Could we have tooltips perhaps for these? I know I’ve looked up the effects a few times but the names aren’t obvious to me so I keep forgetting.
Again, thanks for the hard work you put in – it’s greatly appreciated!