MaltParser 0.2: engMalt

The engMalt archive contains the files needed, together with MaltParser 0.2, to run a parser for English text:
eng/option.dat
eng/english.pos
eng/english.dep
eng/english.par
eng/english.par.mbl.mod
eng/english_input.tab
eng/english_output.tab
The parsing model (eng/english.par.mbl.mod) was trained on sections 0-22 of the Wall Street Journal section of the Penn Treebank, using the feature model specified in eng/english.par. The treebank was converted to dependency trees using the head percolation table of Yamada and Matsumoto (2003) and the dependency type rules of Nivre (2005). The parser presupposes that the input is in Malt-TAB format and tagged with the Penn Treebank part-of-speech tagset. (The part-of-speech tagset can be found in eng/english.pos and the dependency type set in eng/english.dep.)
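
Before parsing your own data, it can help to check that it looks like tagged Malt-TAB input. The sketch below assumes that a Malt-TAB file has one token per line with tab-separated columns (word form first, part-of-speech tag second) and that sentences are separated by blank lines. This layout is an assumption and is not spelled out on this page. The function name malt_tab_stats is hypothetical and not part of MaltParser.

```shell
# Hypothetical helper: count sentences and tokens in a tab-separated file
# and report token lines that have fewer than two columns (no POS tag).
# Assumes one token per line and blank lines between sentences.
malt_tab_stats() {
    awk -F '\t' '
        NF == 0 || $0 ~ /^[[:space:]]*$/ {
            # blank line: close the current sentence, if any
            if (in_sent) { sents++; in_sent = 0 }
            next
        }
        {
            tokens++
            in_sent = 1
            if (NF < 2) bad++
        }
        END {
            # a final sentence without a trailing blank line still counts
            if (in_sent) sents++
            printf "sentences=%d tokens=%d lines_without_tag=%d\n", sents, tokens, bad
        }
    ' "$1"
}
```

Once engMalt is unpacked, calling malt_tab_stats eng/english_input.tab prints a one-line summary; a nonzero lines_without_tag count suggests the file is not tagged as the parser expects.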

Running engMalt

Before running the parser, you need to download and unpack MaltParser 0.2. Then download engMalt.tar.gz into the directory containing the executable file maltparser* and unpack it:

> gunzip engMalt.tar.gz
> tar xvf engMalt.tar

This will create a directory eng containing all the engMalt files. Run the parser with the following command:

> ./maltparser -f eng/option.dat

This parses the test file eng/english_input.tab and stores the result in eng/english_output.tab. To change the input or output file, edit eng/option.dat and change the values of the parameters $INFILE$ and $OUTFILE$. To change the output format from Malt-TAB to Malt-XML (or TIGER-XML), change the value of the parameter $OUTFORMAT$ from TAB to MALTXML (or TIGERXML).
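
Before editing the option file, it is worth keeping a copy of the distributed version and locating the relevant parameters. The sketch below is a hypothetical helper, not part of MaltParser. It assumes only that the parameter names INFILE, OUTFILE and OUTFORMAT appear somewhere in eng/option.dat; the exact layout of that file is not described here, so the matched lines are shown for inspection rather than edited automatically. The backup name option.dat.orig is likewise an arbitrary choice.

```shell
# Hypothetical helper: back up the option file once, then list the lines
# that mention the input/output parameters (with line numbers).
backup_and_show() {
    opt="${1:-eng/option.dat}"
    # keep a pristine copy, but never overwrite an existing backup
    if [ ! -f "$opt.orig" ]; then
        cp "$opt" "$opt.orig" || return 1
    fi
    grep -n -E 'INFILE|OUTFILE|OUTFORMAT' "$opt"
}
```

Running backup_and_show eng/option.dat prints the matching lines, which you can then change in a text editor.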

NB: In principle, any of the options in the option file can be changed, but we cannot guarantee how the parser will behave. In particular, changing the value of the parameters $ALGORITHM$, $FEATURES$ or $LEARNER$ without retraining the parsing model accordingly will make the parser either crash or produce garbage.
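
If the parser misbehaves after you have edited the option file, comparing it against an unmodified copy shows exactly which lines changed. The following is a minimal sketch; the helper name option_changes is hypothetical, and it assumes you have kept a copy of the distributed file (for example by unpacking the archive again in another directory).

```shell
# Hypothetical helper: print the lines that differ between the distributed
# option file and an edited one, or "no changes" if they are identical.
option_changes() {
    orig="$1"
    edited="$2"
    # diff prints the differing lines and exits non-zero when files differ
    if diff "$orig" "$edited"; then
        echo "no changes"
    fi
}
```

Any difference on the lines that set $ALGORITHM$, $FEATURES$ or $LEARNER$ should be reverted unless the model has been retrained.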