The 6 Months Rule

The 6 months rule says that every programmer should look at what he was doing 6 months ago and be disgusted by the way he was doing things.

If you’re a programmer and you look at your code from 6 months ago and you’re still doing the exact same thing today, please stop whatever you’re doing and go learn something new.

Making Coreference Resolution your bitch with OpenNLP 1.5.0

First things first: what is coreference resolution?

Coreference means that multiple expressions in a sentence or document refer to the same thing. OpenNLP contains a “linker” that analyzes the tokens of a sentence to identify which chunks of text refer to the same things (e.g., people, organizations, events).

Take, for example, the text “John drove to Judy’s house. He made her dinner.” Here “John” and “He” both refer to one entity (John), while “Judy” and “her” both refer to a second entity (Judy). Don’t expect OpenNLP to get this 100% correct. Even a simple example like this is a difficult problem.
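To make the goal concrete, here is a minimal sketch of the result a coreference resolver is aiming for on that example: each mention tagged with an entity id, then grouped into chains. This is not OpenNLP’s API. The CorefSketch and Mention names are hypothetical, and the entity ids are labeled by hand rather than computed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CorefSketch {

    /** A mention: a chunk of text plus the id of the entity it refers to (hypothetical type). */
    public static final class Mention {
        final String text;
        final int entityId;

        public Mention(String text, int entityId) {
            this.text = text;
            this.entityId = entityId;
        }
    }

    /** Groups mentions into chains keyed by entity id, keeping document order. */
    public static Map<Integer, List<String>> chains(List<Mention> mentions) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (Mention m : mentions) {
            result.computeIfAbsent(m.entityId, id -> new ArrayList<>()).add(m.text);
        }
        return result;
    }

    /** Hand-labeled mentions for "John drove to Judy's house. He made her dinner." */
    public static List<Mention> example() {
        List<Mention> mentions = new ArrayList<>();
        mentions.add(new Mention("John", 0));
        mentions.add(new Mention("Judy", 1));
        mentions.add(new Mention("He", 0));   // same entity as "John"
        mentions.add(new Mention("her", 1));  // same entity as "Judy"
        return mentions;
    }

    public static void main(String[] args) {
        // Prints one line per entity with all of its mentions.
        chains(example()).forEach((id, texts) ->
                System.out.println("entity " + id + ": " + texts));
    }
}
```

Running main prints “entity 0: [John, He]” and “entity 1: [Judy, her]”. A real linker has to work out those ids on its own, which is the hard part.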

Picking up where I left off once upon a time (and finally wrapping up this series), here are links to the old material:

  • How to use the OpenNLP 1.5.0 Parser
  • Making Coreference Resolution your bitch with OpenNLP 1.5.0 (you’re reading it!)

Setting up Eclipse+CDT with Cygwin

I found this post from Alex’s Tech Blog incredibly helpful when trying to set up my Windows development environment for C programming/debugging (Eclipse + CDT) with Cygwin.

Setup Cygwin toolchain in Eclipse CDT

Be warned that compiling with Cygwin means you are also compiling for Cygwin: the resulting binaries depend on the Cygwin runtime (cygwin1.dll). If you don’t want to rely on Cygwin libraries (e.g., when deploying compiled files to another Windows computer), you’ll want to look at MinGW or MinGW-w64 instead. MinGW, however, is not fully POSIX compliant.