Calculating Readability Metrics In PHP

Readability metrics, also known as readability formulas, are a set of algorithms that estimate the readability of text. Most tests are fairly primitive as they only take into account things like sentence length and the average number of syllables per word, but ignore deeper factors like sentence structure and semantics.

Still, readability metrics can be useful as a rough indicator of how understandable your writing is. Other possible uses include detecting smart-ass comments and even catching spam (obviously, most spam messages would score very low on a readability test).
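To make the "sentence length plus syllables per word" idea concrete, here's a rough sketch of one well-known formula, Flesch Reading Ease. The function names and the vowel-group syllable heuristic are my own assumptions for illustration, not code from any particular library, and real implementations handle far more edge cases.

```php
<?php
// Rough sketch of the Flesch Reading Ease formula.
// Helper names and the syllable heuristic are assumptions made for this example.

function count_sentences(string $text): int
{
    // Treat each run of '.', '!' or '?' as one sentence boundary.
    return max(1, (int) preg_match_all('/[.!?]+/', $text));
}

function count_words(string $text): int
{
    return max(1, str_word_count($text));
}

function count_syllables(string $word): int
{
    // Keep letters only and normalise case.
    $word = strtolower(preg_replace('/[^a-z]/i', '', $word));
    if ($word === '') {
        return 0;
    }
    // Drop a trailing silent 'e' ("make"), but keep words ending in "le" ("table").
    if (strlen($word) > 2 && substr($word, -1) === 'e' && substr($word, -2) !== 'le') {
        $word = substr($word, 0, -1);
    }
    // Each run of consecutive vowels counts as one syllable; every word has at least one.
    return max(1, (int) preg_match_all('/[aeiouy]+/', $word));
}

function flesch_reading_ease(string $text): float
{
    $sentences = count_sentences($text);
    $words     = count_words($text);

    $syllables = 0;
    foreach (str_word_count($text, 1) as $word) {
        $syllables += count_syllables($word);
    }

    // Higher scores mean easier text.
    return 206.835 - 1.015 * ($words / $sentences) - 84.6 * ($syllables / $words);
}

echo round(flesch_reading_ease('The cat sat on the mat.'), 1), "\n"; // 116.1
```

Very simple text can score above 100, as the example shows. Long sentences and polysyllabic words push the score down, which is all the formula really "knows" about your writing.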

If you want to do something like that in your own PHP application, take a look at PHP Text Statistics. It’s a PHP class that implements a number of readability tests.

It can also calculate other metrics like sentence count, syllable count, average words per sentence, and so on.

The class is very easy to use – just include one file, create an instance, and call one of the suggestively named methods. There’s not much “official” documentation but the source code is well-commented and easy to understand.

The only flaw that stuck out to me was how inefficient the script can be. You probably won’t notice it unless you need to analyze huge volumes of text, but it could really use some optimization. In fact, I managed to improve the performance by ~30% simply by eliminating a few redundant function calls (you can download the modified version here if you’re interested).
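The exact changes aren't listed here, but the general idea behind removing redundant calls can be sketched like this: compute the basic counts for a text once, cache them, and let every metric reuse the cached values. The class and method names below are hypothetical and not part of PHP Text Statistics.

```php
<?php
// Hypothetical illustration of caching per-text counts so that several
// metrics computed on the same text don't repeat the same work.
// All names here are invented for this sketch.

class CachedTextCounts
{
    /** Counts keyed by a hash of the text. */
    private $cache = [];

    /** How many times the counts were actually computed (for demonstration). */
    public $computations = 0;

    public function counts(string $text): array
    {
        $key = md5($text);
        if (!isset($this->cache[$key])) {
            $this->computations++;
            $this->cache[$key] = [
                'letters'   => (int) preg_match_all('/[a-z0-9]/i', $text),
                'words'     => max(1, str_word_count($text)),
                'sentences' => max(1, (int) preg_match_all('/[.!?]+/', $text)),
            ];
        }
        return $this->cache[$key];
    }

    public function averageWordsPerSentence(string $text): float
    {
        $c = $this->counts($text);
        return $c['words'] / $c['sentences'];
    }

    public function automatedReadabilityIndex(string $text): float
    {
        // Reuses the cached counts instead of recounting words and sentences.
        $c = $this->counts($text);
        return 4.71 * ($c['letters'] / $c['words'])
             + 0.5 * ($c['words'] / $c['sentences'])
             - 21.43;
    }
}

$stats = new CachedTextCounts();
$text  = 'The cat sat. The dog ran.';
$stats->averageWordsPerSentence($text);
$stats->automatedReadabilityIndex($text);
echo $stats->computations, "\n"; // 1: the text was only counted once
```

Whether this kind of caching pays off depends on how many metrics you compute per text; for a single score on a short string it makes no measurable difference.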

Ah well, it’s a useful class overall. I’m sure it will come in handy for the occasional NLP exploit or mashup.

Hmm, this gives me an idea for a WordPress plugin….


2 Responses to “Calculating Readability Metrics In PHP”

  1. How can I make graphs and charts using PHP? Does it need a specific module to present the data in a pictorial or graphical view?

  2. White Shadow says:

    The simplest way would be to use a third-party charting API, e.g. Google Chart API
