<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Bulk converting doc files into txt (or html)</title>
	<atom:link href="http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/</link>
	<description>&#62; From inner thoughts to the outer limits of Alexandre Rafalovitch</description>
	<lastBuildDate>Mon, 03 Oct 2011 22:01:43 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
	<item>
		<title>By: DrNI</title>
		<link>http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/comment-page-1/#comment-1718</link>
		<dc:creator>DrNI</dc:creator>
		<pubDate>Mon, 08 Sep 2008 08:50:45 +0000</pubDate>
		<guid isPermaLink="false">http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/#comment-1718</guid>
		<description>You could have simply used antiword on the command line. The SVN version of the Web as Corpus ToolKit (WaC TK) includes a module that uses antiword to add DOC documents to a corpus.</description>
		<content:encoded><![CDATA[<p>You could have simply used antiword on the command line. The SVN version of the Web as Corpus ToolKit (WaC TK) includes a module that uses antiword to add DOC documents to a corpus.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: whist</title>
		<link>http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/comment-page-1/#comment-291</link>
		<dc:creator>whist</dc:creator>
		<pubDate>Sun, 29 Jun 2008 03:21:09 +0000</pubDate>
		<guid isPermaLink="false">http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/#comment-291</guid>
		<description>I had a similar problem a few years ago.  I finally managed to write a Delphi Pascal program that could convert doc to txt.  But it was very slow and had to be babysat.  It&#039;s a perennial problem for corpus building.</description>
		<content:encoded><![CDATA[<p>I had a similar problem a few years ago.  I finally managed to write a Delphi Pascal program that could convert doc to txt.  But it was very slow and had to be babysat.  It&#8217;s a perennial problem for corpus building.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexandre Rafalovitch</title>
		<link>http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/comment-page-1/#comment-288</link>
		<dc:creator>Alexandre Rafalovitch</dc:creator>
		<pubDate>Wed, 28 May 2008 12:05:09 +0000</pubDate>
		<guid isPermaLink="false">http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/#comment-288</guid>
		<description>Thanks Alex,

I never actually installed Google Desktop (not that I totally distrust Google or anything), so I haven&#039;t seen those tools. But it is good to know they exist (or is it now &#039;existed&#039;?).</description>
		<content:encoded><![CDATA[<p>Thanks Alex,</p>
<p>I never actually installed Google Desktop (not that I totally distrust Google or anything), so I haven&#8217;t seen those tools. But it is good to know they exist (or is it now &#8216;existed&#8217;?).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Jaculin</title>
		<link>http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/comment-page-1/#comment-289</link>
		<dc:creator>Alex Jaculin</dc:creator>
		<pubDate>Wed, 21 May 2008 10:54:55 +0000</pubDate>
		<guid isPermaLink="false">http://blog.outerthoughts.com/2008/04/bulk-converting-doc-files-into-txt-or-html/#comment-289</guid>
		<description>Once upon a time, there was a pretty good set of tools in the Google Desktop installation (pdf2txt, ppt2txt, etc.).</description>
		<content:encoded><![CDATA[<p>Once upon a time, there was a pretty good set of tools in the Google Desktop installation (pdf2txt, ppt2txt, etc.).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

