Tuesday, May 20, 2008

Python tool: Chinese Treebank

I have uploaded some scripts for processing the Penn Chinese Treebank (CTB) to Google Code. They include a parser that turns bracketed annotations into Python objects, a converter that translates POS tags into the Stanford tagger format, and a set of head-finding rules for converting CTB into dependency trees.
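To give a rough idea of what these three pieces do, here is a minimal sketch written for this post. It is not the code from the repository: the names `Tree`, `parse_tree`, `to_tagged_sentence`, `HEAD_RULES` and `dependencies` are hypothetical, the `_` separator between word and tag is an assumption about the tagger input format, and the head rules are a toy table, far smaller than a real rule set for CTB.

```python
import re

class Tree:
    """A bracketed-tree node: a label with child nodes, or a word for a POS leaf."""
    def __init__(self, label, children=None, word=None):
        self.label = label
        self.children = children or []
        self.word = word

    def is_preterminal(self):
        return self.word is not None

    def tagged_words(self):
        """Return (word, tag) pairs in sentence order."""
        if self.is_preterminal():
            return [(self.word, self.label)]
        pairs = []
        for child in self.children:
            pairs.extend(child.tagged_words())
        return pairs

TOKEN = re.compile(r"\(|\)|[^\s()]+")

def parse_tree(text):
    """Parse one bracketed tree such as "(IP (NP (PN 我)) ...)"."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = ""  # an outer wrapper bracket may carry no label
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children, word = [], None
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                word = tokens[pos]
                pos += 1
        pos += 1  # consume ")"
        return Tree(label, children, word)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree

def to_tagged_sentence(tree, sep="_"):
    """Join word/tag pairs; the "_" separator is an assumption. Empty categories (-NONE-) are dropped."""
    return " ".join(w + sep + t for w, t in tree.tagged_words() if t != "-NONE-")

# Toy head rules (hypothetical): label -> (search direction, preferred child labels).
HEAD_RULES = {
    "IP": ("right", ["VP", "IP"]),
    "VP": ("left", ["VV", "VP"]),
    "NP": ("right", ["NN", "NR", "NP", "PN"]),
}

def head_child(tree):
    direction, preferred = HEAD_RULES.get(tree.label, ("right", []))
    kids = tree.children if direction == "left" else tree.children[::-1]
    for label in preferred:
        for kid in kids:
            if kid.label == label:
                return kid
    return kids[0]  # fallback: first child in search order

def lexical_head(tree):
    """Follow head children down to a POS leaf."""
    while not tree.is_preterminal():
        tree = head_child(tree)
    return tree

def dependencies(tree, deps=None):
    """Collect (dependent word, head word) pairs; empty categories are not filtered here."""
    if deps is None:
        deps = []
    if tree.is_preterminal():
        return deps
    head = lexical_head(tree)
    for kid in tree.children:
        kid_head = lexical_head(kid)
        if kid_head is not head:
            deps.append((kid_head.word, head.word))
        dependencies(kid, deps)
    return deps

tree = parse_tree("(IP (NP (PN 我)) (VP (VV 喜欢) (NP (NN 音乐))))")
print(to_tagged_sentence(tree))  # 我_PN 喜欢_VV 音乐_NN
print(dependencies(tree))        # [('我', '喜欢'), ('音乐', '喜欢')]
```

The example sentence is made up for illustration and is not taken from the treebank.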

I haven't packaged a release for download yet, but I keep updating the source code. For now, the files can be obtained by browsing the trunk of the SVN repository.
