The software in this repository is maintained by individual users; please do not email central Computing Support if you have any queries or problems regarding the software. Instead, email the contact person in the list below. For general questions, please email Frank Keller.
This page only contains the most essential information regarding NLP and speech software under DICE. Please go to the NLP and Speech Software Wiki for local documentation and tips and tricks contributed by users of the software.
export PATH="/group/project/nlp-speech/bin:${PATH}"
export MANPATH="/group/project/nlp-speech/man:${MANPATH}"
export PERLLIB="/group/project/nlp-speech/lib/perl5/5.8.5:/group/project/nlp-speech/lib/perl5/site_perl/5.8.5:${PERLLIB}"
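If the lines above sit in a startup file that is sourced more than once, the same directory can end up in PATH several times. The following is a minimal sketch of one way to avoid that; the function name prepend_path and the variable NLP_ROOT are illustrative assumptions, not part of the local setup. It is applied to PATH only, because an empty component in MANPATH (such as the trailing colon the original line produces when MANPATH is unset) can be significant to man.

```shell
# Hypothetical helper (not part of the DICE setup): print the colon-separated
# list $2 with directory $1 prepended, unless $1 is already an entry.
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;            # already present: unchanged
        *)        printf '%s\n' "${1}${2:+:$2}" ;;   # prepend; no stray colon if $2 is empty
    esac
}

# NLP_ROOT is an illustrative variable name for the shared software tree.
NLP_ROOT=/group/project/nlp-speech
PATH="$(prepend_path "$NLP_ROOT/bin" "$PATH")"
export PATH
```

Sourcing this snippet repeatedly leaves PATH with a single entry for the shared bin/ directory.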
/group/project/nlp-speech/
This directory should be mounted on all DICE machines. The directory structure is explained in more detail in the next section.
When installing software, please pay attention to the following:
bin/      binary executables (see below)
doc/      documentation
etc/      configuration files
include/  header files
lib/      library files
man/      manual pages
share/    shared data files
src/      compiled sources (see below)
pkg/      packages (pristine sources; see below)
./configure --prefix=/group/project/nlp-speech/
make
make install
If the package you want to install is a Perl library, then you typically have to use:
perl Makefile.PL PREFIX=/group/project/nlp-speech/
make
make install
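The choice between the two recipes above can be sketched as a small helper that inspects an unpacked source tree and prints the matching commands without running them. This is only an illustration: the function name build_commands and the variable PREFIX are assumptions, and real packages may need extra options or a different build procedure altogether.

```shell
# Installation prefix taken from the recipes above.
PREFIX=/group/project/nlp-speech/

# Hypothetical helper: print (do not run) the build commands for the
# source tree given as $1, based on which build file it contains.
build_commands() {
    src_dir=$1
    if [ -f "$src_dir/Makefile.PL" ]; then
        # Perl library
        printf '%s\n' "perl Makefile.PL PREFIX=$PREFIX" "make" "make install"
    elif [ -f "$src_dir/configure" ]; then
        # configure-based package
        printf '%s\n' "./configure --prefix=$PREFIX" "make" "make install"
    else
        echo "no configure script or Makefile.PL in $src_dir" >&2
        return 1
    fi
}
```

Calling build_commands with the path to an unpacked package prints the three commands to run from inside that directory.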
Having the pristine sources is important if your software needs to be recompiled later on a different architecture or on a different version of DICE. Note that RPMs are the preferred way of archiving software, as we anticipate moving to an RPM-based local distribution of NLP and speech software in the next release of DICE.
To edit this page (web/resources/nlp/index.html), you need the requisite CVS permissions. All members of the group nlp-speech should have been given these permissions. If that is not the case, please file a Support Request. More information on how to use CVS can be found here.
| Package | Version | Description | Maintainer | URL | |
| CorpusWorkbench | 3.0 | Corpus query tools, incl. CQP | Central | here | |
| R | 1.9.1 | Package for statistical computing | Central | here | |
| WordNet | 1.7.1 | Lexical database | Central | here | |
| bow | 0.2 | Toolkit for statistical language modeling, text retrieval, classification and clustering | Central | here | |
| graphviz | 1.10 | Graph visualization software | Central | here | |
| netlab | 3.2 | Neural network toolkit for Matlab | Central | here | |
| NLTK | 2.0b6 | Natural Language Toolkit | Central | here | |
| pipestat | 5.4 | Command-line-based statistical analysis | Central | here | |
| splus | 7.0 | Statistical analysis package with graphical frontend | Central | here | |
| tnt | 2.2 | Thorsten Brants's Part-of-Speech Tagger | Central | here | |
| weka | 3.2.3 | Machine learning Algorithms in Java | Central | here |
| Package | Version | Description | Maintainer | URL | |
| Bilingual Sentence Aligner | 1.0 | Bob Moore's tool for sentence alignment in parallel bilingual corpus | Mirella Lapata | here | |
| BoosTexter | 2.1 | Classifier using boosting | Mirella Lapata | here | |
| BootCaT Toolkit | 0.1.2 | Simple Utilities for Bootstrapping Corpora and Terms from the Web | Mirella Lapata | here | |
| C4.5 | release 8, 10/95 | Classification tree generator | Mirella Lapata | here | |
| CDE | 1.0 | CCG and DRT Environment | Johan Bos | here | |
| CMU-Cambridge Toolkit | 2.05 | Statistical Language Modeling Toolkit | Frank Keller | here | |
| Cass/Scol | 1h | Steve Abney's partial parser | Mirella Lapata | here | |
| Charniak_parser | 05Mar18 | Eugene Charniak's Parser | Frank Keller | here | |
| Cluto | 2.1.1 | Clustering high-dimensional datasets | Mirella Lapata | here | |
| DBparser | 0.9.9a | Dan Bikel's Parser | Frank Keller | here | |
| ESPS | 6.0 | ESPS/waves+ with EnSig | Volker Strom | here | |
| Evalb | -- | Parser bracketing evaluation tool | Frank Keller | here | |
| Festival | 1.96 | Speech synthesis engine | Rob Clark | here | |
| German Chunker | 1.0 | Chunker for German developed by Helmut Schmid and Sabine Schulte | Mirella Lapata | here | |
| Giza++ | 2.0 | Training of statistical translation models | Mirella Lapata | here | |
| Gsearch | 2.07 | Tool for finding syntactic patterns in unparsed text | Frank Keller | here | |
| HTK | 3.4, Oct '06 | HMM Tool Kit | Volker Strom | here | |
| Infomap NLP | 0.8.5 | Latent Semantic Analysis for NLP | Mirella Lapata | here | |
| Kino | 0.6.5 | Digital video editor | Rob Clark | here | |
| LDA-C | 1.0 | C implementation of latent Dirichlet allocation | Mirella Lapata | here | |
| LP Solve | 5.5 | Mixed integer linear programming solver. | Mirella Lapata | here | |
| LT Chunk | 3.0 | LTG syntactic chunker | Mirella Lapata | here | |
| LexChainer | 1.0 | A tool to find semantically related words within unrestricted texts | Mirella Lapata | here | |
| LingPipe | 2.1.1 | Java tools for the linguistic analysis on natural language data | Mirella Lapata | here | |
| LoPar | 3.0 | Helmut Schmid's left-corner parser | Frank Keller | here | |
| MaltParser | 1.2 | Nivre's data-driven dependency parser | Ewan Klein | here | |
| Mary TTS | 3.0 | Saarland University's TTS | Volker Strom | here | |
| Megam | 0.3 | Maximum entropy model optimization package | Mirella Lapata | here | |
| Minipar | 1.0 | Dekang Lin's broad-coverage parser | Mirella Lapata | here | |
| Morpha/morphg/ana | 1.0 | John Carroll's morphological tools | Frank Keller | here | |
| Mxpost | 1997 | Adwait Ratnaparkhi's POS tagger | Volker Strom | here | |
| NSP | 0.67 | Ted Pedersen's N-gram Statistics Package | Mirella Lapata | here | |
| Normalized Cut | 1.0 | Matlab code for normalized cut image segmentation | Mirella Lapata | here | |
| PDTB Tools | 1.2.3 | Tools for the Penn Discourse Treebank | Frank Keller | here | |
| Pharaoh | 1.2.3 | Beam search decoder for phrase-based statistical machine translation models | Mirella Lapata | here | |
| Praat | 4.3.24 | Analyze and synthesize speech | Volker Strom | here | |
| Primula | 1.0 | Inference with relational Bayesian networks | Mirella Lapata | here | |
| Prover9 & Mace4 | 2009-02A | Automated first-order theorem prover & finite model builder | Ewan Klein | here | |
| Proximity | 4.0 | System for relational knowledge discovery | Mirella Lapata | here | |
| QuickNet | 3.11 | Tools and C++ library for using Multi-Layer Perceptrons (MLPs) | Partha Lal | here | |
| RASP | October 2002 | John Carroll and Ted Briscoe's parser | Mirella Lapata | here | |
| Ratingtest | 1.0 | Tool for web-based listening test | Volker Strom | here | |
| Rule Based Tagger | 1.14 | Eric Brill's rule based part of speech tagger | Mirella Lapata | here | |
| SNoW | 3.1 | Sparse Network of Winnows | Mirella Lapata | here | |
| SRILM | 1.4.5 | LM toolkit from SRI International | Volker Strom | here | |
| SVM light | 6.01 | Support vector machines | Mirella Lapata | here | |
| SamIam | 2.3 | Modeling and reasoning with Bayesian networks | Mirella Lapata | here | |
| SCTK | 2.1.7 | NIST's Speech Recognition Scoring Toolkit, version 2.1.7-20070222-1638 | Partha Lal | here | |
| SenseClusters | 0.71 | Cluster similar contexts together using unsupervised methods | Mirella Lapata | here | |
| Snack | 2.2.10 | Sound Toolkit for Tcl/Tk or Python | Frank Keller | here | |
| Sonic | 2.0 beta 5 | University of Colorado Speech Recognizer | Volker Strom | here | |
| Spade | 0.9 | Sentence-level parsing for discourse | Mirella Lapata | here | |
| SPRACHcore | 2004-08-26 | ICSI Speech toolkit | Partha Lal | here | |
| Text Similarity | 3.0 | Ted Pedersen's Perl package for computing text similarity measures | Frank Keller | here | |
| Tgrep | 1.14 | Treebank search tool | Frank Keller | here | |
| TigerSearch | 2.1 | Corpus search tools | Frank Keller | here | |
| Timbl | 5.1 | Tilburg Memory-based Learner | Mirella Lapata | here | |
| TinySVM | 0.09 | Support Vector Machines | Frank Keller | here | |
| TreeView | 1.0 | Tree viewing tool | Frank Keller | here | |
| WordNet QueryData | 3.0 | Jason Rennie's Perl package for querying WordNet | Frank Keller | here | |
| WordNet Similarity | 3.0 | Ted Pedersen's Perl package for computing WordNet-based similarity measures | Frank Keller | here | |
| Yale | 3.2 | Environment for machine learning experiments and data mining | Frank Keller | here | |
| YamCha | 0.33 | Yet Another Multipurpose Chunk Annotator | Frank Keller | here |