This is where I store assorted code and commands for basic text analysis on web archives in a variety of formats. Much of it has been removed because it has been replaced by Python scripts (still under development, to be made public soon).
Right now, it is mainly a place to store some named-entity recognition (NER) scripts, à la http://williamjturkel.net/2013/06/30/named-entity-recognition-with-command-line-tools-in-linux/.
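
As a rough illustration of the kind of post-processing an NER workflow tends to need, here is a minimal Python sketch that tallies entities in text that has already been tagged. It is not one of the scripts stored here. It assumes the tagger wrote inline-XML-style output (for example Stanford NER run with `-outputFormat inlineXML`) using the three tags PERSON, ORGANIZATION, and LOCATION; the function name `count_entities` and the sample sentence are made up for the example.

```python
import re
from collections import Counter

# Assumption: input looks like Stanford NER inlineXML output, e.g.
# "<PERSON>Jane Doe</PERSON> visited <LOCATION>Ottawa</LOCATION>."
# Only these three tag names are recognised by this sketch.
TAG_PATTERN = re.compile(
    r"<(PERSON|ORGANIZATION|LOCATION)>(.+?)</\1>", re.DOTALL
)


def count_entities(tagged_text):
    """Return a dict mapping each tag name to a Counter of entity strings."""
    counts = {}
    for tag, entity in TAG_PATTERN.findall(tagged_text):
        # Collapse runs of whitespace so line-wrapped names count as one entity.
        name = " ".join(entity.split())
        counts.setdefault(tag, Counter())[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any real archive.
    sample = (
        "<PERSON>Jane Doe</PERSON> met <PERSON>Jane Doe</PERSON> "
        "in <LOCATION>Ottawa</LOCATION>."
    )
    for tag, counter in count_entities(sample).items():
        for name, n in counter.most_common():
            print(tag, name, n, sep="\t")
```

To run it over a real file, read the tagged output into a string and pass it to `count_entities`; the result can then be sorted or written out however is convenient.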