Readability metrics, also known as readability formulas, are a set of algorithms that estimate the readability of text. Most of them are fairly primitive: they only look at surface features like sentence length and the average number of syllables per word, and ignore deeper factors like sentence structure and semantics.
Still, readability metrics can be useful as a rough indicator of how understandable your writing is. Other possible uses include detecting smart-ass comments and even catching spam (obviously, most spam messages would score very low on a readability test).
If you want to do something like that in your own PHP application, take a look at PHP Text Statistics. It’s a PHP class that implements a number of readability tests, including:
- Flesch-Kincaid Reading Ease (probably the most popular metric)
- Flesch-Kincaid Grade Level
- Gunning-Fog Index
- Coleman-Liau Index
- SMOG Index
- Automated Readability Index
It can also calculate other metrics like sentence count, syllable count, average words per sentence, and so on.
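To make those building blocks a bit more concrete, here is a rough sketch of how two of the formulas can be computed from sentence, word, and syllable counts. This is not the code from PHP Text Statistics: the function names are made up for illustration, and the syllable counter is a deliberately naive assumption (it just counts vowel groups), so expect the scores to differ from what the class reports.

```php
<?php
// Naive syllable estimate: one syllable per group of consecutive vowels.
// Assumption for illustration only; real syllable counting is much fussier.
function estimateSyllables(string $word): int
{
    return max(1, (int) preg_match_all('/[aeiouy]+/i', $word));
}

// Returns rough sentence, word, and syllable counts for an English text.
function roughTextCounts(string $text): array
{
    // Split on runs of ., ! or ? and drop whitespace-only fragments.
    $sentences = array_filter(
        preg_split('/[.!?]+/', $text, -1, PREG_SPLIT_NO_EMPTY),
        static function ($s) { return trim($s) !== ''; }
    );

    // Treat runs of letters and apostrophes as words.
    preg_match_all("/[a-z']+/i", $text, $matches);
    $words = $matches[0];

    return [
        'sentences' => max(1, count($sentences)),
        'words'     => max(1, count($words)),
        'syllables' => array_sum(array_map('estimateSyllables', $words)),
    ];
}

// Flesch Reading Ease: higher scores mean easier text.
function fleschReadingEase(string $text): float
{
    $c = roughTextCounts($text);
    return 206.835
        - 1.015 * ($c['words'] / $c['sentences'])
        - 84.6 * ($c['syllables'] / $c['words']);
}

// Flesch-Kincaid Grade Level: approximate US school grade.
function fleschKincaidGradeLevel(string $text): float
{
    $c = roughTextCounts($text);
    return 0.39 * ($c['words'] / $c['sentences'])
        + 11.8 * ($c['syllables'] / $c['words'])
        - 15.59;
}

// Six one-syllable words in one sentence: about 116.1 with this naive counter.
printf("%.1f\n", fleschReadingEase('The cat sat on the mat.'));
```

Very simple text can score above 100, which is a nice reminder that these formulas only measure surface features.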
The class is very easy to use: just include one file, create an instance, and call one of the descriptively named methods. There’s not much “official” documentation, but the source code is well-commented and easy to understand.
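Those three steps might look something like the sketch below. The file name `TextStatistics.php`, the class name `TextStatistics`, and the method name `flesch_kincaid_reading_ease()` are my assumptions about a typical copy of the library, so check them against the source you downloaded. The wrapper function itself is hypothetical; it simply returns `null` if the library isn't where you said it would be.

```php
<?php
// ASSUMPTIONS: the library is a single file called TextStatistics.php that
// defines a TextStatistics class with a flesch_kincaid_reading_ease() method.
// Adjust these names to match your copy of the library.
function readingEaseFromLibrary(string $text, string $libraryFile = 'TextStatistics.php'): ?float
{
    if (!is_file($libraryFile)) {
        return null; // library not found at the given path
    }

    require_once $libraryFile;                      // 1. include one file

    if (!class_exists('TextStatistics', false)) {
        return null; // the file did not define the expected class
    }

    $statistics = new TextStatistics();             // 2. create an instance

    if (!method_exists($statistics, 'flesch_kincaid_reading_ease')) {
        return null; // method name differs in this version
    }

    // 3. call one of the methods
    return (float) $statistics->flesch_kincaid_reading_ease($text);
}

$score = readingEaseFromLibrary('The quick brown fox jumps over the lazy dog.');
echo $score === null ? "Library not found\n" : "Reading ease: $score\n";
```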
The only flaw that stuck out to me was how inefficient the script can be. You probably won’t notice it unless you need to analyze huge volumes of text, but it could really use some optimization. In fact, I managed to improve the performance by ~30% simply by eliminating a few redundant function calls (you can download the modified version here if you’re interested).
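To give an idea of the kind of optimization I mean, here is a minimal sketch of caching the raw counts so that several metrics can share one pass over the text. It is not the modified version mentioned above, and the class and method names are hypothetical; it just illustrates the general "compute once, reuse everywhere" idea.

```php
<?php
// Hypothetical helper: computes the raw counts once per text and reuses
// them for every metric instead of recounting on each call.
class CachedTextCounts
{
    private $cache = [];

    // How many times the expensive counting pass actually ran.
    public $computations = 0;

    public function counts(string $text): array
    {
        $key = md5($text);
        if (!isset($this->cache[$key])) {
            $this->computations++;
            $this->cache[$key] = [
                // Rough counts; good enough to show the caching pattern.
                'sentences'  => max(1, (int) preg_match_all('/[.!?]+/', $text)),
                'words'      => max(1, str_word_count($text)),
                'characters' => (int) preg_match_all('/[a-z0-9]/i', $text),
            ];
        }
        return $this->cache[$key];
    }

    public function averageWordsPerSentence(string $text): float
    {
        $c = $this->counts($text);
        return $c['words'] / $c['sentences'];
    }

    // Automated Readability Index, built from the same cached counts.
    public function automatedReadabilityIndex(string $text): float
    {
        $c = $this->counts($text);
        return 4.71 * ($c['characters'] / $c['words'])
            + 0.5 * ($c['words'] / $c['sentences'])
            - 21.43;
    }
}

$stats = new CachedTextCounts();
$text  = 'One two three. Four five.';
$stats->averageWordsPerSentence($text);
$stats->automatedReadabilityIndex($text);
echo $stats->computations, "\n"; // the counting pass ran only once
```

The trade-off is memory: if you analyze a lot of distinct texts in one request, you would want to cap or clear the cache.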
Ah well, it’s a useful class overall. I’m sure it will come in handy for the occasional NLP exploit or mashup.
Hmm, this gives me an idea for a WordPress plugin…