<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Advanced Spell Checker For WordPress</title>
	<atom:link href="http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/feed/" rel="self" type="application/rss+xml" />
	<link>http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/</link>
	<description>Slightly Advanced Computer Stuff (and some magic)</description>
	<lastBuildDate>Sat, 21 Nov 2009 04:22:17 +0200</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: WordPress Articles for june 9 2009 &#124; WPStart.org - WordPress themes, plugins and news</title>
		<link>http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/comment-page-1/#comment-30490</link>
		<dc:creator>WordPress Articles for june 9 2009 &#124; WPStart.org - WordPress themes, plugins and news</dc:creator>
		<pubDate>Tue, 09 Jun 2009 15:52:52 +0000</pubDate>
		<guid isPermaLink="false">http://w-shadow.com/?p=1138#comment-30490</guid>
		<description>[...] Advanced Spell Checker For WordPress After the Deadline is an advanced spell checker plugin for WordPress that was released on Monday. In addition to the standard spell check and suggestions features, it also includes style and grammar checking. - By W-Shadow [...]</description>
		<content:encoded><![CDATA[<p>[...] Advanced Spell Checker For WordPress After the Deadline is an advanced spell checker plugin for WordPress that was released on Monday. In addition to the standard spell check and suggestions features, it also includes style and grammar checking. &#8211; By W-Shadow [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dashnine Media &#187; AtD Updates</title>
		<link>http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/comment-page-1/#comment-30450</link>
		<dc:creator>Dashnine Media &#187; AtD Updates</dc:creator>
		<pubDate>Fri, 05 Jun 2009 17:41:32 +0000</pubDate>
		<guid isPermaLink="false">http://w-shadow.com/?p=1138#comment-30450</guid>
		<description>[...] Shadow has written up his reaction to AtD at his blog.  Bryan at CMS Report published the announcement for AtD.  And we [...]</description>
		<content:encoded><![CDATA[<p>[...] Shadow has written up his reaction to AtD at his blog.  Bryan at CMS Report published the announcement for AtD.  And we [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Raphael Mudge</title>
		<link>http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/comment-page-1/#comment-30423</link>
		<dc:creator>Raphael Mudge</dc:creator>
		<pubDate>Tue, 02 Jun 2009 16:49:06 +0000</pubDate>
		<guid isPermaLink="false">http://w-shadow.com/?p=1138#comment-30423</guid>
		<description>Oops, this is the perfectionist scientist in me who can&#039;t edit comments: the numbers are 0.005 or 0.5%, 0.0011 or 0.11%, 0.0029 or 0.29%.  I apologize for any confusion I may have caused by leaving the errant % in there.</description>
		<content:encoded><![CDATA[<p>Oops, this is the perfectionist scientist in me who can&#8217;t edit comments: the numbers are 0.005 or 0.5%, 0.0011 or 0.11%, 0.0029 or 0.29%.  I apologize for any confusion I may have caused by leaving the errant % in there.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Raphael Mudge</title>
		<link>http://w-shadow.com/blog/2009/06/02/advanced-spell-checker-for-wordpress/comment-page-1/#comment-30422</link>
		<dc:creator>Raphael Mudge</dc:creator>
		<pubDate>Tue, 02 Jun 2009 16:47:31 +0000</pubDate>
		<guid isPermaLink="false">http://w-shadow.com/?p=1138#comment-30422</guid>
		<description>Thanks for the write-up.  Without any biasing I&#039;m able to detect misused words with 90% accuracy.  Unfortunately, without any biasing I also flag correctly used words as misused.  This becomes a particular problem because folks are more likely to use a word correctly than incorrectly.

So what I&#039;ve done is identify the words the statistical model doesn&#039;t work well for (it&#039;s/its, a/an, there/their, etc.), move them off the misused word list, and create grammar rules for them.

This has given me some wiggle room to cut my biasing down.  I&#039;m aiming for a 0.0050 false positive rate--it&#039;s at 0.0011 and 0.0029% in my tests now.  Once I change this balance you&#039;ll notice the misused word detection is more accurate.  

What makes this problem challenging is that for some word pairs, depending on the context, either word could feasibly be correct.  The solution is to look at more context, but the more context I use, the greater the risk that data sparseness leads to more false positives and less coverage of how the words are actually used.  It&#039;s an interesting problem.

What is nice (from my perspective) is that where one tool falls short, I have another tool I can use to make up the difference (such as the grammar checker).</description>
		<content:encoded><![CDATA[<p>Thanks for the write-up.  Without any biasing I&#8217;m able to detect misused words with 90% accuracy.  Unfortunately, without any biasing I also flag correctly used words as misused.  This becomes a particular problem because folks are more likely to use a word correctly than incorrectly.</p>
<p>So what I&#8217;ve done is identify the words the statistical model doesn&#8217;t work well for (it&#8217;s/its, a/an, there/their, etc.), move them off the misused word list, and create grammar rules for them.</p>
<p>This has given me some wiggle room to cut my biasing down.  I&#8217;m aiming for a 0.0050 false positive rate&#8211;it&#8217;s at 0.0011 and 0.0029% in my tests now.  Once I change this balance you&#8217;ll notice the misused word detection is more accurate.  </p>
<p>What makes this problem challenging is that for some word pairs, depending on the context, either word could feasibly be correct.  The solution is to look at more context, but the more context I use, the greater the risk that data sparseness leads to more false positives and less coverage of how the words are actually used.  It&#8217;s an interesting problem.</p>
<p>What is nice (from my perspective) is that where one tool falls short, I have another tool I can use to make up the difference (such as the grammar checker).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
