I have been using the Stanford NLP Parser from the command line with the -tagSeparator flag to supply it with partially tagged input. As the parser seems to be really bad with date expressions and complex named entities, I need this functionality.
Now, I need to wrap the parser in my own code to add input/output batching and I [...]
Continue reading about Duplicating -tagSeparator effect when using Stanford Parser programmatically
This was the fastest beta invite confirmation ever. Unfortunately, Digger’s Terms of Service do not allow any sort of disclosure about its features or results. This is very different from Powerset, which has been going out of its way to let beta subscribers (even unconfirmed ones) know what it is doing. Digger does [...]
Powerset hasn’t even started competing with Google yet and already it has its own competitor.
Digger - which is currently in private beta - does sense disambiguation of the search terms like everybody else. Unlike everybody else, however, they expose the underlying WordNet definitions to the searcher and allow them to pick, rate and even discuss [...]
Continue reading about Digger - Another NLP enhanced search engine (beta)
I found another online syntax tree visualiser that can cope with large trees - phpSyntaxTree. It requires square brackets instead of the parentheses of Lisp-style s-expressions, but it should not be too hard to convert from one to the other. There is also a Ruby version of the application from a different developer, but it refused to [...]
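As a rough illustration of the conversion mentioned above, the sketch below swaps parentheses for square brackets. It assumes Penn Treebank-style parser output, where literal brackets in the sentence are already escaped as -LRB-/-RRB-, and that phpSyntaxTree accepts the label-first layout unchanged. The function name is made up for this example.

```python
def sexpr_to_brackets(tree: str) -> str:
    """Convert a parenthesised parse tree to square-bracket notation.

    Assumption: every '(' and ')' in the input is tree structure,
    not part of a token (true when brackets are escaped as -LRB-/-RRB-).
    """
    # Check that the parentheses are balanced before converting.
    depth = 0
    for ch in tree:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in tree")
    if depth != 0:
        raise ValueError("unbalanced '(' in tree")

    # A character-for-character swap is all the conversion needs.
    return tree.translate(str.maketrans("()", "[]"))


if __name__ == "__main__":
    print(sexpr_to_brackets("(S (NP (DT the) (NN dog)) (VP (VBZ barks)))"))
```

The balance check is only there to catch truncated parser output early; the conversion itself is a single character swap.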
Are you a regular yet not subscribed reader of this blog? Would you like to subscribe, but haven’t figured out how? I apologise.
I have received an email asking how to subscribe to my blog via email. That made me think about subscribers in general, and that some of you may be stuck reading this [...]
In my review of WordChamp and LingQ I mentioned that an ideal language learning system would have deep support for the specifics of the learner’s target language. I was asked to clarify what I mean by that.
I have now found an example of what could be a step in the right direction. It is an [...]
Continue reading about Learning english prepositions - the smart way
Two books, two views - no agreement, but certainly a lot of sparks. Is the Internet full of junk, and by killing off the conventional media are we losing all our good information sources? That is the point of view of Andrew Keen, author of the book The Cult of the Amateur. On the other [...]
Continue reading about Is the Internet good, bad or bits of everything? - Weinberger/Keen debates
Just a link to an interesting article by Sunayana on Natural Language Processing as applied to problems in India.
She has an interesting point that because NLP is so underdeveloped in India, even undergraduate-level projects may be contributing to the cutting edge of research.
This is similar to what was mentioned in the podcast about Somali speech [...]
Somewhere between upgrading WordPress and (possibly) upgrading the MySQL database, all the non-Latin characters got corrupted. This included my Russian, Spanish, French and possibly Esperanto writing. Basically, anything in UTF-8 that is not in Latin-1.
The partial solution was to go into database administration and change the text columns' collation to Unicode. Unfortunately, that only fixed [...]
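The kind of corruption described above usually means that UTF-8 bytes were read back as Latin-1 somewhere along the way. The sketch below shows how that damage arises and how it can sometimes be reversed in code. It is not necessarily what was done on this blog: it assumes the text went through exactly one wrong Latin-1 decoding step, and the helper name is invented for the example.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Latin-1'.

    Returns the text unchanged if it does not look like that kind
    of damage (hypothetical helper, for illustration only).
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either the text has characters outside Latin-1, or its
        # bytes are not valid UTF-8: assume it was not corrupted.
        return text


if __name__ == "__main__":
    original = "Привет"
    # Simulate the corruption: UTF-8 bytes misread as Latin-1.
    garbled = original.encode("utf-8").decode("latin-1")
    print(repair_mojibake(garbled) == original)
```

Note that MySQL's latin1 character set is closer to Windows-1252 than to ISO 8859-1, so real database dumps may need a different codec than the one assumed here.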
I have tried tamarind before as an ingredient in dishes, but I have never actually seen the real fruit. I wasn’t even sure it was edible uncooked. So, when I saw it sold in the shop, I had to try it. It turned out to be a very educational experience. The fruit is layered with [...]