Broken Link Checker for WordPress
Sometimes, links get broken. A page is deleted, a subdirectory forgotten, a site moved to a different domain. Most likely many of your blog posts contain links. It is almost inevitable that over time some of them will lead to a “404 Not Found” error page. Obviously you don’t want your readers to be annoyed by clicking a link that leads nowhere. You can check the links yourself but that might be quite a task if you have a lot of posts. You could use your webserver’s stats but that only works for local links.
So I’ve made a plugin for WordPress that will check your posts (and pages), looking for broken links, and let you know if any are found.
Download it now! (10 KB)
Features
- Checks your posts (and pages) in the background (whenever the WP admin panel is open).
- Detects links that don’t work and missing images. Checks both internal and outbound links.
- Notifies you on the Dashboard if any problems are found.
- Link checking intervals can be configured.
- New/modified posts are checked ASAP.
The broken links show up in the Manage -> Broken Links tab. If any invalid URLs are found a notification will also show up in the sidebar on the Dashboard.
The Broken Links tab displays a list of invalid URLs found along with the relevant posts and the anchor text of the links. “View” and “Edit Post” do exactly what they say and “Discard” will remove the message about a broken link, but not the link itself (so it will show up again later unless you fix it; this plugin doesn’t modify your links).
By default all old posts/links are re-checked every 72 hours, or you can set a different time period.
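To illustrate the re-check rule, here is a minimal Python sketch (the plugin itself is PHP, and this is not its code). The function name `links_due_for_check` and the `last_checked` mapping are hypothetical; only the 72-hour default comes from the text above.

```python
from datetime import datetime, timedelta

# Default re-check period from the text; assumed to be configurable.
RECHECK_INTERVAL = timedelta(hours=72)


def links_due_for_check(last_checked, now, interval=RECHECK_INTERVAL):
    """Return URLs that were never checked or were last checked
    at least `interval` ago.

    `last_checked` maps URL -> datetime of the last check (or None).
    """
    due = []
    for url, checked_at in last_checked.items():
        if checked_at is None or now - checked_at >= interval:
            due.append(url)
    return due
```

A link from a new or modified post would simply be stored with `None` as its last-check time, so it is picked up on the next run.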
Notes (Semi-Technical)
I realize there are a lot of features that could be added to improve this plugin considerably. However, this release was intended to “test the waters” and see if there’s demand for a plugin like this, so I only implemented the most basic functions. Since then, the plugin has been upgraded to be slightly beyond “basic”.
I thought about using WP’s pseudo-cron to run the link checker by schedule and decided against it. AFAIK the cronjobs execute when a page is requested; since this plugin does some lengthy processing it may increase page load times unacceptably when used in this manner. That’s why I set it to run the checks asynchronously (AJAX) and invisibly in the admin panel.
Installation
Just like any other WordPress plugin -
- Download (see below).
- Unzip.
- Upload the broken-link-checker folder to your wp-content/plugins directory.
- Activate the plugin in the Plugins tab.
Upgrading
- Deactivate the plugin (important!).
- Do steps 1.-2. from “Installation” (download and unzip).
- Upload the broken-link-checker folder to your wp-content/plugins directory.
- Re-activate the plugin in the Plugins tab.
Download
Version 0.3.5 : broken-link-checker.zip (10 KB)
(It needs at least WordPress 2.0.x to work, maybe 2.1.x. I’ve tested on 2.1.3 - 2.5)
June 16th, 2008 at 5:54 pm
Nice plugin! Thanks a lot! I hope a future version could check broken links in comments too.
May 26th, 2008 at 2:59 pm
[...] Broken Link Checker: And lastly I like this plugin because it shows me what posts have broken links on them so I can correct the problems and eliminate future SEO problems because of missing pages! [...]
May 22nd, 2008 at 3:10 am
Yeah… I’m seeing it happen now on multiple blogs, none of which are having load time issues, and all of which are also running plug-ins like Bad Behavior, AntiLeech and/or others (many of the URLs reported as invalid are internal)… I’ll play with different configurations to try to pin down where the incompatibility lies. thx
May 21st, 2008 at 9:14 pm
It’s possible that the links incorrectly reported as broken point to websites that employ some kind of anti-bot protection, which might block the link checker. This has happened before.
Another possibility is that the “broken” links load very slowly, so the connection times out and the plugin assumes the links don’t work.
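One way to make these two failure modes visible is to report them separately from genuine 404s. The following Python sketch is only an illustration, not what the plugin does: the function name `classify_result`, the category labels, and the choice of status codes treated as "possibly blocked" are all assumptions.

```python
def classify_result(status, timed_out=False):
    """Map the outcome of one HTTP request to a coarse category.

    `status` is the HTTP status code, or None if no response arrived.
    The set of "possibly blocked" codes is an assumption: these are
    codes that anti-bot protection commonly returns, so such links
    deserve a manual look rather than a "broken" verdict.
    """
    if timed_out:
        return "timeout"            # slow server, not necessarily broken
    if status is None:
        return "unreachable"        # DNS failure, refused connection, etc.
    if 200 <= status < 400:
        return "ok"
    if status in (404, 410):
        return "broken"             # the page is really gone
    if status in (401, 403, 429, 503):
        return "possibly blocked"
    return "error"
```

With categories like these, only the "broken" results would need to be shown as invalid, while "timeout" and "possibly blocked" links could be retried later or flagged as uncertain.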
May 21st, 2008 at 9:02 pm
This is a very cool plug-in but I’m having some trouble with it. It’s constantly reporting many valid URLs as if they were invalid, aside from catching the occasional invalid one. I’ve experimented with the settings to no avail. I’m running 2.5… Any suggestions? Thanks
May 15th, 2008 at 9:20 pm
[...] noarchive) along with Nofollow Case by Case. Add in Google XML Sitemaps, Ozh’ Better Feed, Broken Link Checker and PagerFix if/as applicable, for bonus [...]
May 15th, 2008 at 1:51 pm
I didn’t figure it out at first, but now I know it doesn’t work because I haven’t added any specific “image-unlinking” code. So the “Unlink” button only works for links.
Maybe one day
May 15th, 2008 at 1:31 pm
hey white shadow, as mentioned in my pioneer (around 10 to 20) comment earlier in the development of your genius plugin - I do still get the unlink error.
only occurs with images (wikipedia and my own), do you have any idea how to fix that?
cheers,
jez
May 13th, 2008 at 12:40 am
Yeah I was about to tell you to do what Leo said, it’s definitely better and more logical.
May 12th, 2008 at 8:16 pm
Interesting, that seems to work. I’ll have a new version up soon, thanks
May 12th, 2008 at 6:14 pm
You can probably use get_permalink, giving it the post’s ID as the parameter. This function is provided by WordPress core.
Anyway I’ve not tried that yet. Good luck with that! We can find a way out together if you are faced with any coding problem.
May 12th, 2008 at 6:05 pm
Ahha, and how do I get the permalinks? (It’s probably not extremely hard, but I didn’t figure it out when writing that code)
May 12th, 2008 at 12:28 pm
Hi,
Thanks for your amazing plugin!
I have a request for an enhancement.
I went through the source code and noticed that the “View” links are generated according to the GUIDs. This may cause problems on non-English blogs, such as my blog in Chinese. I think using permalinks is a better approach. Hope this helps!
Thanks again.
Regards
Leo
May 11th, 2008 at 11:32 pm
I suspect nearly 30% of my posts are about new plugins/features. So I’d rather skip posting about small updates
May 11th, 2008 at 11:10 pm
Yeah, I figured that out by viewing your plugin’s source. I simply ran a REPLACE query on the guid column, so that’s fixed now. The only problem is that some of the GUIDs date from when I used a different permalink structure (for example /?p=432), and as far as I can tell there’s no way of fixing that (I already tried updating permalinks). Oh well, but hey, at least your plugin is working perfectly fine. If I ever notice anything then I’ll hit you up.
You should announce this new feature man, I’m positive you’ll get a lot of attention for it and many people will love it. It would be neater if it worked the way slug editing does: the URL is just text, clicking on it turns it into a text box, and clicking outside of it or on the save link/button saves the change (if not the former then at least the latter). This would be a lot more intuitive and make for a smoother process. But hey, at least it’s working now, perfectly fine too! This suggestion is just a touch-up.
I’ll let you know if anything happens but so far the plugin is working great. If you need anything let me know!
May 11th, 2008 at 10:24 pm
Well, you never know when a bug might crop up
The plugin gets the URL for the “View” link from the WordPress database (using the “guid” field). Some of your older posts probably have outdated database info. I think.
May 11th, 2008 at 10:12 pm
Hmm…don’t know what errors you’re talking about, where do you get them and when? I’ve tried the new plugin and it works perfectly, really, I’ve tried it multiple times.
One weird thing is that when I click view post on some of them, it uses my old domain name, only on posts that were created when I had that domain name, but I’m not sure if it’s your plugin’s own doing.
May 11th, 2008 at 1:23 pm
Ahem. It definitely isn’t easy. I tried, and got weird and unusual MySQL errors that I couldn’t trace to any particular source. Still, I’ve uploaded an experimental implementation to wordpress.org (you should get an “upgrade available” notice in your “Plugins” tab).
May 11th, 2008 at 3:28 am
White Shadow, you already have everything you need to implement what I mentioned (The click on the link and it shows an input box for faster link fixing). You already have the Regular Expressions in the wsblc_ajax.php file, just make a callback like unlink_link_callback and use a different $matches variable (Maybe $matches[0] I think is the href, not positive), the SQL queries already seem to be fine. The only major change will be adding the input box and all of that, and you can see an example of that in the wp-admin\js\slug.js file, the first function, edit_permalink. This function is the one that shows the input box and all of that.
I’m not trying to say it’s absolutely easy and bashing you for not doing it or anything, I would do it if I had a better understanding of this, just trying to help you out with some clarifications. If you ever want to give it a try, I’ll be glad to test it out for you. Like I said earlier, I’m /positive/ that every one of your users will benefit from this. It’d definitely be faster than having to click edit post for each post and look through the long post to find the link. Not trying to nag or anything though I know it’ll probably seem this way, just trying to show that I really care about this feature enough to investigate.
May 1st, 2008 at 8:10 pm
thanks, I’ll try that.
May 1st, 2008 at 7:44 pm
The option that you can set is how often the plugin is invoked (how often it runs). So if you have it at 5 seconds, it will
1) start up every five seconds (launched by JavaScript from the admin panel)
2) run for five seconds (approximate)
3) repeat from 1. while there are links to check
The intention is to use all the allotted time, so in an ideal case the plugin will run continuously - until there are no more links to check.
If it slows down your dashboard too much, try setting it to very long intervals instead.
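The loop described in this comment can be sketched roughly as follows. This is a Python illustration under assumptions, not the plugin’s code (which is PHP triggered by JavaScript): `run_check_slice`, `check_link` and the injectable `clock` are hypothetical names.

```python
import time
from collections import deque


def run_check_slice(queue, check_link, budget_seconds, clock=time.monotonic):
    """Check queued links until the time budget is used up or the
    queue is empty. Returns the (url, result) pairs from this slice.

    The caller (in the plugin's case, a timer in the admin panel)
    invokes this again after each interval while links remain.
    """
    deadline = clock() + budget_seconds
    results = []
    while queue and clock() < deadline:
        url = queue.popleft()
        results.append((url, check_link(url)))
    return results
```

Note that a single slow `check_link` call can still overrun the budget, since the deadline is only tested between links. If the invocation interval equals the budget, the slices run back to back, which matches the "run continuously" behaviour described above; a longer interval leaves idle time between slices.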
May 1st, 2008 at 7:03 pm
At what interval does it check? I have it set to 5 seconds for how long it checks, but my blog slows way down for a page request when it is running. If I could set this plugin to run for 5 seconds once every 60 seconds that would be great. This page slowdown is only there when the plugin is active and I’m in the admin panel. I think.
I broke my site during the wp2.5.1 upgrade, reloaded and now the database links have shuffled. So I probably have many broken links or static links that go to the wrong post. I had deleted old posts here and there, and when I reuploaded the database backup, mysql redid the post numbers filling in the blank spots.
Thanks,
D.J.
April 30th, 2008 at 1:07 am
[...] using any widgets, but… I am using a few plugins: the Broken Link Checker and the Dashboard Widget Manager. (The latter lets you clear stuff you don’t want to look at out [...]
April 26th, 2008 at 3:04 am
Hello,
First off, thank you very much for writing this plugin, great idea! The plugin was working great. But now the pages are not redirecting. In Explorer the pages get hung up indefinitely; in Firefox it gives this error.
The page isn’t redirecting properly
Firefox has detected that the server is redirecting the request for this address in a way that will never complete.
* This problem can sometimes be caused by disabling or refusing to accept cookies.
Any ideas of what could have happened? Are there plugins not compatible with Link Cloaker?
Thank you VERY MUCH for your time.
Bert
April 25th, 2008 at 6:58 pm
Thank you a lot! I will do it with wsblc_ajax.php. Have a nice weekend!
April 25th, 2008 at 3:34 pm
Not in the current version, but if you know PHP, it should be pretty easy to modify wsblc_ajax.php to work with cron. You’d need to slightly modify it so that wsblc_ajax.php?action=run_check doesn’t require the user to be logged in.
I’m not going to add official cron support to the plugin because many users don’t have the option to use cron.
April 25th, 2008 at 2:20 pm
Thank you very much!
This is a great plugin and made it possible for me to check and fix more than 3000 pages with 9000 links.
Is there a possibility to start it as a cron job?
April 21st, 2008 at 1:26 am
It’s regexps + some hooks to know when a post is modified. You’d probably need to access the DB directly to modify a post (there’s probably an internal function for that, but I’ve found the documentation pretty sparse in this regard).
I have a general idea of how to do the box thing at the moment… And sure, feel free to send me any code you might come up with.
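For readers curious about the regex approach mentioned here, a rough Python sketch follows. It is not the plugin’s code (the plugin is PHP and its actual patterns are not shown on this page); `LINK_RE`, `find_links` and `replace_link` are hypothetical names, and a regular expression like this only handles reasonably well-formed HTML with quoted href values.

```python
import re

# Matches <a ... href="URL" ...>anchor text</a> with single or double quotes.
LINK_RE = re.compile(
    r'<a\s[^>]*?href\s*=\s*(["\'])(?P<url>.*?)\1[^>]*>(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def find_links(html):
    """Return (url, anchor_text) pairs for every link in the post body."""
    return [(m.group("url"), m.group("text")) for m in LINK_RE.finditer(html)]


def replace_link(html, old_url, new_url):
    """Return the post body with `old_url` replaced inside href
    attributes only, leaving the rest of the markup untouched."""
    def swap(m):
        if m.group("url") != old_url:
            return m.group(0)
        whole = m.group(0)
        start, end = m.span("url")
        offset = m.start()  # convert absolute positions to match-relative
        return whole[:start - offset] + new_url + whole[end - offset:]
    return LINK_RE.sub(swap, html)
```

An in-place "edit URL" box like the one discussed in this thread would then only need to send the old and new URL to the server, which applies something like `replace_link` to the stored post content.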
April 21st, 2008 at 1:15 am
”
* Also check for missing CSS files.
* And missing JavaScript.”
I understand your enthusiasm but this doesn’t necessarily have to be in a broken LINK checker, as broken link checker implies hyperlinks, nevertheless it would be a great feature.
Really though, not trying to nag you or anything, but wouldn’t this be a simple thing to do? Just take a look at the code in 2.5 that handles the following: when you go to write a post, it automatically generates a post slug under the post title. You can click on this slug and it’ll turn into a box with which you can change the post slug. Just use this same code to do the box thing. Then in the manner that you found the link within the post (Regex? Or did you hook onto a WP function? I’ll look at the plugin code later), change the link within the post.
I might try and hack something together later, I’ll give you whatever I could come up with, if anything.
April 21st, 2008 at 12:05 am
Complete, you say? Ah no, not even close. To be “complete” it would need to do at least the following:
* Also check for missing CSS files.
* And missing JavaScript.
* Check blogroll links.
* Check links in comments.
* Actually, check all links.
* Have a link checking function that doesn’t trigger any bot detectors.
* Gracefully handle “soft” 404s - when a site doesn’t send the expected 404 Not Found HTTP response, but displays a search page or something like that instead.
* Work without cURL (for compatibility).
* And so on (ad infinitum?).
There are always more ideas than there is time (or enthusiasm, or skill).
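As an aside on the “soft” 404 item in the list above: one common heuristic (not something this plugin implements) is to also request a deliberately nonexistent URL on the same site and compare the two responses. The Python sketch below assumes both pages have already been fetched; `looks_like_soft_404`, the probe idea as used here, and the 0.9 similarity threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher


def looks_like_soft_404(status, body, probe_status, probe_body, threshold=0.9):
    """Guess whether a 200 response is really a disguised "not found" page.

    `probe_status` and `probe_body` come from requesting a made-up URL
    on the same site that cannot exist. If the site answers the probe
    with 200 as well, and the suspect page is nearly identical to the
    probe page, the link is probably a soft 404.
    """
    if status != 200:
        return False   # a real error code is handled elsewhere
    if probe_status != 200:
        return False   # the site reports missing pages properly
    similarity = SequenceMatcher(None, body, probe_body, autojunk=False).ratio()
    return similarity >= threshold
```

The threshold would need tuning in practice, because error pages often embed the requested URL or a timestamp and so are rarely byte-for-byte identical.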