By Alexandre Rafalovitch, on January 11th, 2013
I knew I was neglecting my blog in 2012, but I did not realize just how much until I received WordPress’ year in review for 2012 (feel free to take a peek at it). The line that stopped me dead was “In 2012, there was 1 new post”. Sure enough – one post it was.
Well, this . . . → Read More: Oops: there goes the blog in 2012
By Alexandre Rafalovitch, on March 26th, 2009
A recent article on LingPipe discussed named entities containing conjunctions, such as Johnson and Johnson and Wallace and Gromit. They suggest that one way of treating such an entity is as a frozen expression. I assume that means relying on statistical measures to see whether the multi-word expression repeats often enough to be treated as a unit.
In the . . . → Read More: Conjunctions in named entities
By Alexandre Rafalovitch, on January 25th, 2009
I am collecting my reading and reference material in CiteULike. I like the service because it can capture details from multiple sources. It also lets me discover what other interesting people have collected by navigating the graph of tags, people and bookmarks.
Nice as CiteULike is, it is fairly difficult to get an overall picture of . . . → Read More: Visualizing CiteULike collections
By Alexandre Rafalovitch, on January 17th, 2009
Dr. René Witte has just created a new mailing list (SENLP) to discuss applying NLP techniques to Software Engineering and also to discuss general Software Engineering issues in developing NLP systems.
I am interested in both topics. I spent 3 years in senior technical support at BEA and could see how applying NLP techniques on . . . → Read More: New mailing list to discuss junction of NLP and Software Engineering
By Alexandre Rafalovitch, on January 13th, 2009
I am frustrated. I know my corpus (resolutions of the United Nations General Assembly) has a lot in common with the biomedical and legal domains. And I can find interesting articles in the biomedical domain dealing with similar issues of complex tokenization, long named entity mentions (though mine are much longer), etc. But I see nothing in . . . → Read More: Where are all legal computational linguistics resources?
By Alexandre Rafalovitch, on August 25th, 2008
It is hard enough to explain what we are doing to our professors; explaining it in plain English to our friends and family is nearly impossible.
So it is always good to see people who can explain what a POS tagger is and why it is important without having to throw around references to Norvig or . . . → Read More: Explaining Computational Linguistics to friends and family
By Alexandre Rafalovitch, on April 19th, 2008
I have written before about converting Microsoft Word files into text or HTML using OpenOffice. However, the wizards I described in that article were crashing once the number of files reached several hundred.
I have written some macros to do the conversion, but they were scary-looking and fragile. Fortunately, I have now found a . . . → Read More: Bulk converting doc files into txt (or html)
By Alexandre Rafalovitch, on January 24th, 2008
While reading the Weka Data Mining book, I came across this impressive example of using machine learning to confirm a person’s authorship (p. 358).
In the 19th century, there lived a famous rabbinic scholar, Ben Ish Chai, who among other writings had two collections of letters. Ben Ish Chai claimed that only one collection was his and . . . → Read More: On uselessness of pretending to be somebody else
By Alexandre Rafalovitch, on December 1st, 2007
What could Computational Linguistics and Aerobics have in common? Quite a lot, as it turns out.
Dance descriptions, while not really in English, do have a regular structure and can be thought of as a sub-language with a full set of syntactic, semantic and pragmatic levels.
There are basic words of the language (move . . . → Read More: Parsing jumping jacks
By Alexandre Rafalovitch, on October 6th, 2007
From time to time I experiment with the GATE NLP toolkit. Just now I tried to upgrade to the latest version (version 4) and ran into a really strange problem with the ANNIE system not loading correctly. Later, when I uninstalled the older GATE version, it stopped loading at all.
The problem is the user configuration file gate.xml that . . . → Read More: Upgrading to GATE 4? Beware of leftover configuration files.