QuEst - an open source tool for translation quality estimation
This software has two main modules: a Java module that extracts a number of sentence-level features (and a few word-level features), and a Python module that interfaces with the scikit-learn toolkit for machine learning. It also includes a few Python and shell scripts for small auxiliary tasks.
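As a rough illustration of the learning side of this split, the sketch below trains a scikit-learn regressor on a matrix of sentence-level features and reports the error on held-out sentences. It is not QuEst's own learning module: the function name `train_and_evaluate`, the choice of SVR, the hyperparameters, and the synthetic data are all assumptions made for the example. The only thing taken from this page is the idea of one numeric feature vector per sentence.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.svm import SVR


def train_and_evaluate(train_x, train_y, test_x, test_y):
    """Fit an SVR on feature vectors and return (model, MAE on the test set).

    Hypothetical helper for illustration; not part of QuEst.
    """
    # Hyperparameters are illustrative defaults, not tuned values.
    model = SVR(kernel="rbf", C=1.0, epsilon=0.1)
    model.fit(train_x, train_y)
    predictions = model.predict(test_x)
    return model, mean_absolute_error(test_y, predictions)


if __name__ == "__main__":
    # Synthetic stand-in for extracted features: 200 sentences x 17 features.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 17))
    # Synthetic quality scores that depend on the first feature plus noise.
    scores = 0.5 * features[:, 0] + rng.normal(scale=0.1, size=200)

    model, mae = train_and_evaluate(
        features[:150], scores[:150], features[150:], scores[150:]
    )
    print(f"MAE on held-out sentences: {mae:.3f}")
```

In practice the synthetic matrix would be replaced by the features produced by the Java module and the scores by human quality labels; how those files are laid out is described in the QuEst documentation rather than assumed here.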
See QuEst documentation.
The license for our Java code is BSD. For pre-existing code and resources, e.g., scikit-learn, SRILM, GIZA++, and the Berkeley parser, please check the license terms on their respective websites.
This software was developed as part of the QUEST, QTLaunchPad and QT21 projects.
Versions of the software
Get QuEst++, an extended version of QuEst with word- and document-level features, as described in our ACL demo paper.
Get a vanilla version of the source code for QuEst++.
Get the "online" version of the baseline QuEst (17 features only), which pre-loads resources in memory for better efficiency when processing sentence by sentence.
Get the installation script that will download the stable version of the source code, a pre-built version (jar), plus all necessary linguistic processors (parsers, etc.). To run it on Linux/Mac: ./install.sh
Get a stable version of the above source code only (without the linguistic processors).
Get a vanilla version of the source code which is easier to run (and re-build), as it relies on fewer pre-processing resources/tools. Toy resources for en-es are also included in this version. It extracts up to 50 features only.
Check the development version of the code on GitHub; send me an email if you want to be a collaborator.
Check the current baseline, black-box, and glass-box lists of features QuEst can extract.
We also provide some resources for the WMT shared task datasets, e.g. WMT13.
Lucia Specia - University of Sheffield