This page will soon be obsolete.
MSTParser is now a SourceForge project.
The new project was started by Jason Baldridge and
Ryan McDonald to make it easier to add new features
to the parser.
Code will be available soon. Try it out here!
This is a simple web page for downloading the implementations
of the parsers described in:
Non-Projective Dependency Parsing using Spanning Tree Algorithms
R. McDonald, F. Pereira, K. Ribarov and J. Hajic
Online Large-Margin Training of Dependency Parsers
R. McDonald, K. Crammer and F. Pereira
Online Learning of Approximate Dependency Parsing Algorithms
R. McDonald and F. Pereira
The parser is implemented in Java.
New: Version 0.2 uses second-order edge features (see the EACL paper above, "Online Learning of Approximate Dependency Parsing Algorithms").
New: Version 0.1 has the ability to produce typed (or labeled) trees.
Please view the README file to learn about usage and the input/output formats.
Questions: ryantm at cis dot upenn dot edu
What character encoding does the parser use?
It is hard-coded for Unicode (UTF-8), in line with the
CoNLL-X shared task. You can
grep for "UTF8" in the source and replace all occurrences with whatever encoding you want.
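The search-and-replace step can also be scripted. The following is a minimal sketch, not part of the distribution: the function names `replace_encoding` and `patch_tree` are made up for illustration, and it assumes the Java sources can be read as UTF-8 text and that the new name (for example ISO-8859-1) is one that Java recognizes.

```python
import pathlib


def replace_encoding(java_source, new_encoding, old_encoding="UTF8"):
    """Return java_source with every occurrence of old_encoding replaced."""
    return java_source.replace(old_encoding, new_encoding)


def patch_tree(root, new_encoding):
    """Rewrite every .java file under root in place.

    Returns the list of paths whose contents changed.
    Assumption: the source files are readable as UTF-8 text.
    """
    changed = []
    for path in sorted(pathlib.Path(root).rglob("*.java")):
        original = path.read_text(encoding="utf-8")
        patched = replace_encoding(original, new_encoding)
        if patched != original:
            path.write_text(patched, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Work on a copy of the source tree, then recompile the parser after patching.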
Can the parser use CoNLL-X input format?
Not yet. However, I have included some easy-to-use Python scripts to convert
between the CoNLL and MSTParser formats. They are in the scripts directory.
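To give a rough idea of what such a conversion involves, here is a small sketch. It is not one of the bundled scripts. It assumes the standard ten-column, tab-separated CoNLL-X layout (FORM in column 2, POSTAG in column 5, HEAD in column 7, DEPREL in column 8) and assumes that the labeled MSTParser format consists of four tab-separated lines per sentence (words, POS tags, labels, head indices) followed by a blank line. Check the README for the authoritative description; the function names here are hypothetical.

```python
def read_conll_sentences(lines):
    """Yield each sentence as a list of column lists; blank lines separate sentences."""
    sentence = []
    for line in lines:
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
        else:
            sentence.append(line.rstrip("\r\n").split("\t"))
    if sentence:
        yield sentence


def conll_to_mst(conll_text):
    """Convert CoNLL-X text into four-line blocks (assumed MSTParser layout)."""
    blocks = []
    for rows in read_conll_sentences(conll_text.splitlines()):
        words = [r[1] for r in rows]   # FORM
        tags = [r[4] for r in rows]    # POSTAG
        labels = [r[7] for r in rows]  # DEPREL
        heads = [r[6] for r in rows]   # HEAD
        block = "\n".join("\t".join(col) for col in (words, tags, labels, heads))
        blocks.append(block + "\n\n")  # a blank line ends each sentence
    return "".join(blocks)
```

Going the other way requires filling the unused CoNLL-X columns with the underscore placeholder.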
Can the parser produce non-tree dependency graphs?
Not yet. This will be part of the next release.
Is the edge labeler any good?
This is somewhat complicated. The parser currently predicts dependencies
and labels jointly. This is nice since it allows the information from both
decisions to be used simultaneously. However, the labeler is forced to obey
the locality constraints of the dependency parser (single edges or pairs of edges).
I have found that it is often better to use a post-processing edge labeler, which
can draw its features from a larger scope. Such a labeler is not difficult to create, and
any classifier can be used. I suggest MALLET.
I will make a post-processing labeler available in the next version.
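As an illustration of what "a larger scope for features" could mean, the sketch below extracts features for one edge of an already-parsed tree, looking at the grandparent, all siblings, and the child's own dependents. This is only a hypothetical example, not the labeler planned for the next version: the function name and feature names are invented, and the resulting dictionaries would still have to be fed to a classifier of your choice.

```python
def edge_features(words, tags, heads, child):
    """Features for the edge entering `child` in a fixed tree.

    Tokens are numbered from 1; a head index of 0 means the artificial root.
    heads[i - 1] is the head of token i.
    """
    def tag(i):
        return "ROOT" if i == 0 else tags[i - 1]

    def word(i):
        return "<root>" if i == 0 else words[i - 1]

    head = heads[child - 1]
    n = len(heads)
    # Other tokens attached to the same head (scope beyond a pair of edges).
    siblings = [i for i in range(1, n + 1) if heads[i - 1] == head and i != child]
    # Tokens attached to the child itself.
    dependents = [i for i in range(1, n + 1) if heads[i - 1] == child]
    return {
        "child_word": word(child),
        "child_tag": tag(child),
        "head_word": word(head),
        "head_tag": tag(head),
        "direction": "L" if child < head else "R",
        "grandparent_tag": "NONE" if head == 0 else tag(heads[head - 1]),
        "sibling_tags": "+".join(tag(i) for i in siblings),
        "num_child_dependents": len(dependents),
    }
```

For the sentence "John saw Mary" with heads 2, 0, 2, the edge into "John" sees "Mary" as a sibling and the root as its grandparent, information that a single-edge model does not have.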