number of tokens in the source
number of tokens in the target
ratio of number of tokens in source and target
number of tokens in the target / number of tokens in the source
absolute difference between number of tokens in source and target, normalised by source length
average source token length
number of mismatched brackets (opening brackets without closing brackets, and vice versa)
number of mismatched quotation marks (opening marks without closing marks, and vice versa)
source sentence LM probability
source sentence LM perplexity
source sentence LM perplexity without end of sentence marker
target sentence LM probability
target sentence LM perplexity
target sentence LM perplexity without end of sentence marker
number of occurrences of the target word within the target hypothesis (averaged for all words in the hypothesis - type/token ratio)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.01)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.05)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.1)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.2)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.5)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.01) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.05) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.1) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.2) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.5) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza: prob > 0.01) weighted by the inverse frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza: prob > 0.05) weighted by the inverse frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza: prob > 0.1) weighted by the inverse frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza: prob > 0.2) weighted by the inverse frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza: prob > 0.5) weighted by the inverse frequency of each word in the source corpus
average unigram frequency in quartile 1 of frequency (lower frequency words) in the corpus of the source language
average unigram frequency in quartile 2 of frequency (lower frequency words) in the corpus of the source language
average unigram frequency in quartile 3 of frequency (lower frequency words) in the corpus of the source language
average unigram frequency in quartile 4 of frequency (lower frequency words) in the corpus of the source language
average bigram frequency in quartile 1 of frequency (lower frequency words) in the corpus of the source language
average bigram frequency in quartile 2 of frequency (lower frequency words) in the corpus of the source language
average bigram frequency in quartile 3 of frequency (lower frequency words) in the corpus of the source language
average bigram frequency in quartile 4 of frequency (lower frequency words) in the corpus of the source language
average trigram frequency in quartile 1 of frequency (lower frequency words) in the corpus of the source language
average trigram frequency in quartile 2 of frequency (lower frequency words) in the corpus of the source language
average trigram frequency in quartile 3 of frequency (lower frequency words) in the corpus of the source language
average trigram frequency in quartile 4 of frequency (lower frequency words) in the corpus of the source language
percentage of distinct source unigrams seen in a corpus of the source language (in all quartiles)
percentage of distinct source bigrams seen in a corpus of the source language (in all quartiles)
percentage of distinct source trigrams seen in a corpus of the source language (in all quartiles)
average word frequency: on average, each type (unigram) in the source sentence appears x times in the corpus (in all quartiles)
absolute difference between number of periods in source and target sentences
absolute difference between number of periods in source and target sentences normalised by target length
absolute difference between number of commas in source and target sentences
absolute difference between number of commas in source and target sentences normalised by target length
absolute difference between number of : in source and target sentences
absolute difference between number of : in source and target sentences normalised by target length
absolute difference between number of ; in source and target sentences
absolute difference between number of ; in source and target sentences normalised by target length
absolute difference between number of ? in source and target sentences
absolute difference between number of ? in source and target sentences normalised by target length
absolute difference between number of ! in source and target sentences
absolute difference between number of ! in source and target sentences normalised by target length
percentage of punctuation marks in source sentence
percentage of punctuation marks in target sentence
absolute difference between number of punctuation marks in source and target sentences normalised by target length
percentage of numbers in the source sentence
percentage of numbers in the target sentence
absolute difference between number of numbers in the source and target sentences normalised by source sentence length
number of tokens in the source sentence that do not contain only a-z
percentage of tokens in the target sentence which do not contain only a-z
ratio of percentage of tokens a-z in the source and tokens a-z in the target sentence
percentage of content words in the source sentence
percentage of content words in the target sentence
ratio of percentage of content words in the source and target
LM probability of POS tags of target sentence
LM perplexity of POS tags of target sentence
percentage of nouns in the source sentence
percentage of verbs in the source sentence
percentage of nouns in the target sentence
percentage of verbs in the target sentence
ratio of percentage of nouns in the source and target sentences
ratio of percentage of verbs in the source and target sentences
ratio of percentage of pronouns in the source and target sentences
geometric mean (lambda-smoothed) of 1-to-4-gram precision scores of target translation against a pseudo-reference produced by a second MT system
number of dependency relations with aligned constituents normalised by the total number of dependencies (max between source and target sentences)
number of dependency relations with aligned constituents normalised by the total number of dependencies (max between source and target sentences), with the order of the constituents ignored
number of dependency relations with possibly aligned constituents (using Giza's lexical table with p > 0.1) normalised by the total number of dependencies (max between source and target sentences)
absolute difference between the depth of the syntactic trees of the source and target sentences
number of prepositional phrases in the source sentence
number of prepositional phrases in the target sentence
absolute difference between the number of PP phrases in the source and target sentences
absolute difference between the number of PP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
absolute difference between the number of NPs in the source and target sentences
absolute difference between the number of NP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
absolute difference between the number of VPs in the source and target sentences
absolute difference between the number of VP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
absolute difference between the number of ADJPs in the source and target sentences
absolute difference between the number of ADJP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
absolute difference between the number of ADVPs in the source and target sentences
absolute difference between the number of ADVP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
absolute difference between the number of CONJPs in the source and target sentences
absolute difference between the number of CONJP phrases in the source and target sentences normalised by the total number of phrasal tags in the source sentence
source probabilistic context-free grammar (PCFG) parse log-likelihood
source PCFG average confidence of all possible parses in n-best list
source PCFG confidence of best parse
count of possible source PCFG parses
target PCFG parse log-likelihood
target PCFG average confidence of all possible parses in n-best list of parse trees for the sentence
target PCFG confidence of best parse
count of possible target PCFG parses
Kullback-Leibler divergence of source and target topic distributions
Jensen-Shannon divergence of source and target topic distributions
source sentence intra-lingual triggers
target sentence intra-lingual triggers
source-target sentence inter-lingual mutual information
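
A few of the simpler features listed above (token counts and ratios, mismatched brackets, comma differences, and the two topic-distribution divergences) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the feature extractor the list comes from: whitespace tokenisation, the function names, the dictionary keys, and the small smoothing constant `eps` are all hypothetical choices made here, and the topic distributions are assumed to arrive as equal-length probability lists.

```python
import math

# Assumption: these three bracket pairs are the ones of interest.
BRACKET_PAIRS = {"(": ")", "[": "]", "{": "}"}


def tokenize(sentence):
    """Assumption: tokens are separated by whitespace (pre-tokenised input)."""
    return sentence.split()


def mismatched_brackets(text):
    """Openers without closers plus closers without openers, per bracket type."""
    return sum(abs(text.count(o) - text.count(c)) for o, c in BRACKET_PAIRS.items())


def surface_features(source, target):
    """Compute a handful of length and punctuation features for one sentence pair."""
    src, tgt = tokenize(source), tokenize(target)
    n_src, n_tgt = len(src), len(tgt)
    comma_diff = abs(source.count(",") - target.count(","))
    return {
        "src_tokens": n_src,
        "tgt_tokens": n_tgt,
        "src_tgt_ratio": n_src / n_tgt if n_tgt else 0.0,
        "tgt_src_ratio": n_tgt / n_src if n_src else 0.0,
        # absolute length difference normalised by source length
        "abs_len_diff_norm": abs(n_src - n_tgt) / n_src if n_src else 0.0,
        "avg_src_token_len": sum(len(t) for t in src) / n_src if n_src else 0.0,
        "comma_diff": comma_diff,
        # comma difference normalised by target length
        "comma_diff_norm": comma_diff / n_tgt if n_tgt else 0.0,
    }


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps is a hypothetical guard against division by zero."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Calling `surface_features(source, target)` on a pre-tokenised sentence pair returns a dictionary of values; the language-model, alignment, and parser-based features in the list need external resources and are not covered by this sketch.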