By arafalov, on August 15th, 2007
When the OpenNLP toolkit uses the MaxEnt parser, it has to read in about 25 MB of model files. The model reader opens them with a plain, unbuffered FileInputStream. The result is an excessive number of system calls (and disk accesses) during parser startup.
The fix is extremely simple:
In maxent-2.4.0/src/java/opennlp/maxent/io/ObjectGISModelReader.java, replace
new FileInputStream(f) with
new BufferedInputStream(new FileInputStream(f), 1000000)
Recompile the maxent library
Deploy new version . . . → Read More: Reducing disk thrashing of OpenNLP/MaxEnt parser – with one line code change
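To illustrate why the wrapper helps, here is a minimal, self-contained sketch of the same pattern outside of maxent. The class name BufferedModelRead and the method sumBytes are made up for this example and are not part of OpenNLP or maxent; only the BufferedInputStream wrapping and the 1000000-byte buffer size come from the fix above. Reading byte by byte from the buffered stream hits the in-memory buffer most of the time instead of going to the operating system for every read.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedModelRead {

    // Hypothetical helper (not a maxent API): reads the file one byte at a
    // time and returns the sum of all byte values. Because the
    // FileInputStream is wrapped in a BufferedInputStream, most read() calls
    // are served from the in-memory buffer rather than the underlying file.
    static long sumBytes(File f) throws IOException {
        try (InputStream in =
                 new BufferedInputStream(new FileInputStream(f), 1000000)) {
            long sum = 0;
            int b;
            while ((b = in.read()) != -1) {
                sum += b;
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small stand-in "model" file containing the bytes 0..99.
        File f = File.createTempFile("model", ".bin");
        f.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(f)) {
            for (int i = 0; i < 100; i++) {
                out.write(i);
            }
        }
        System.out.println(sumBytes(f)); // prints 4950
    }
}
```

The buffer size is a tuning knob, not a magic number: anything from a few kilobytes upward already removes most of the per-byte system calls, and the large value here simply mirrors the one used in the fix.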
By arafalov, on August 11th, 2007
I was not able to get the OpenNLP parser to work. There were no samples to play with and no command-line tools to run. And I don’t even want to talk about documentation, because there was not any. There was an attempt at a lame joke (at least that’s the only sense I can make of what.html . . . → Read More: Getting OpenNLP parser to work
By arafalov, on August 5th, 2007
Bikel’s statistical parser is designed to be run from the command line. I need to run it from my own code.
The following wrapper seems to do the trick on Windows (with your own values for |parserdir|):
String settingsFile = "|parserdir|\\settings\\collins.properties";
Settings.load(settingsFile);
Parser parser = new Parser("|parserdir|\\bikel\\wsj-02-21.obj.gz");
Sexp result = parser.parse(Sexp.read("(This is a funny world)").list());
There is a complaint when running the . . . → Read More: Running Bikel’s parser programmatically