<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Unravelling the black magic of bureaucracy</title>
	<link>http://blog.outerthoughts.com/2007/09/unravelling-the-black-magic-of-bureaucracy/</link>
	<description>&#62; From inner thoughts to the outer limits of Alexandre Rafalovitch</description>
	<pubDate>Sun, 20 Jul 2008 22:29:26 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
		<item>
		<title>By: Alexandre Rafalovitch</title>
		<link>http://blog.outerthoughts.com/2007/09/unravelling-the-black-magic-of-bureaucracy/#comment-17886</link>
		<dc:creator>Alexandre Rafalovitch</dc:creator>
		<pubDate>Mon, 21 Jan 2008 23:40:37 +0000</pubDate>
		<guid>http://blog.outerthoughts.com/2007/09/unravelling-the-black-magic-of-bureaucracy/#comment-17886</guid>
		<description>A training corpus is always an option. The problem is that building a good training corpus takes a very long time, and often it is all one manages to do for a PhD. I am hoping to do things in a smarter way, with building a corpus kept as a fallback option.</description>
		<content:encoded><![CDATA[<p>A training corpus is always an option. The problem is that building a good training corpus takes a very long time, and often it is all one manages to do for a PhD. I am hoping to do things in a smarter way, with building a corpus kept as a fallback option.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gumby Bear</title>
		<link>http://blog.outerthoughts.com/2007/09/unravelling-the-black-magic-of-bureaucracy/#comment-17669</link>
		<dc:creator>Gumby Bear</dc:creator>
		<pubDate>Thu, 10 Jan 2008 15:51:14 +0000</pubDate>
		<guid>http://blog.outerthoughts.com/2007/09/unravelling-the-black-magic-of-bureaucracy/#comment-17669</guid>
		<description>Have you considered marking up a training corpus to allow better statistical prediction? Statistical methods fail to resolve very long names as entities because they were not trained on them. If you train your resolver on very long names, it will have less of a problem resolving them!

You could try checking the GATE manual for details, but it should be possible with any of the larger annotation frameworks.</description>
		<content:encoded><![CDATA[<p>Have you considered marking up a training corpus to allow better statistical prediction? Statistical methods fail to resolve very long names as entities because they were not trained on them. If you train your resolver on very long names, it will have less of a problem resolving them!</p>
<p>You could try checking the GATE manual for details, but it should be possible with any of the larger annotation frameworks.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
