percentage of numbers in the source
percentage of numbers in the target
absolute difference between number of numbers in the source and target sentence normalised by source sentence length
percentage of source words that contain non-alphabetic symbols
percentage of target words that contain non-alphabetic symbols
ratio of percentage of tokens a-z in the source and tokens a-z in the target
average unigram frequency in quartile 1 of frequency in the corpus of the source language
average unigram frequency in quartile 2 of frequency in the corpus of the source language
average unigram frequency in quartile 3 of frequency in the corpus of the source language
average unigram frequency in quartile 4 of frequency in the corpus of the source language
average bigram frequency in quartile 1 of frequency in the corpus of the source language
average bigram frequency in quartile 2 of frequency in the corpus of the source language
average bigram frequency in quartile 3 of frequency in the corpus of the source language
average bigram frequency in quartile 4 of frequency in the corpus of the source language
average trigram frequency in quartile 1 of frequency in the corpus of the source language
average trigram frequency in quartile 2 of frequency in the corpus of the source language
average trigram frequency in quartile 3 of frequency in the corpus of the source language
average trigram frequency in quartile 4 of frequency in the corpus of the source language
percentage of distinct source unigrams seen in a corpus of the source language (in all quartiles)
percentage of distinct source bigrams seen in a corpus of the source language (in all quartiles)
percentage of distinct source trigrams seen in a corpus of the source language (in all quartiles)
target sentence LM probability
target sentence LM perplexity
average number of translations per source word in the sentence (threshold in giza1: prob > 0.01)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.05)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.1)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.2)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.5)
average number of translations per source word in the sentence (threshold in giza1: prob > 0.01) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.05) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.1) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.2) weighted by the frequency of each word in the source corpus
average number of translations per source word in the sentence (threshold in giza1: prob > 0.5) weighted by the frequency of each word in the source corpus
absolute difference between number of periods in source and target sentences
absolute difference between number of commas in source and target sentences
absolute difference between number of : in source and target sentences
absolute difference between number of ; in source and target sentences
absolute difference between number of ? in source and target sentences
absolute difference between number of ! in source and target sentences
absolute difference between number of periods in source and target sentences normalised by target length
absolute difference between number of commas in source and target sentences normalised by target length
absolute difference between number of : in source and target sentences normalised by target length
absolute difference between number of ; in source and target sentences normalised by target length
absolute difference between number of ? in source and target sentences normalised by target length
absolute difference between number of ! in source and target sentences normalised by target length
percentage of punctuation marks in source sentence
percentage of punctuation marks in target sentence
absolute difference between number of punctuation marks in source and target sentences normalised by target length
source sentence LM probability
source sentence LM perplexity
number of tokens in target phrase
number of tokens in source phrase
number of tokens in the source / number of tokens in the target
number of tokens in the target / number of tokens in the source
average target token length
average source token length
average number of occurrences of the target word within the target sentence
percentage of unaligned words
percentage of words with more than one aligned word
average number of aligned words per word
percentage of content words in the source sentence
percentage of content words in the target sentence
percentage of verbs in the source sentence
percentage of verbs in the target sentence
percentage of nouns in the source sentence
percentage of nouns in the target sentence
percentage of pronouns in the source sentence
percentage of pronouns in the target sentence
ratio of percentage of content words in the source and target sentences
ratio of percentage of nouns in the source and target sentences
ratio of percentage of verbs in the source and target sentences
ratio of percentage of pronouns in the source and target sentences
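
A few of the surface features listed above (token counts and ratios, percentage of numbers, punctuation differences) can be sketched in a handful of lines. The following is only an illustration, not the extractor that produced this feature set: the function names are hypothetical, tokens are assumed to be whitespace-separated, a "number" is assumed to be a token made of digits with optional sign and separators, and percentages are returned as fractions between 0 and 1. The LM, giza1, alignment and part-of-speech features need external resources and are not covered.

```python
import re

# Assumption: a "number" is a token of digits with optional sign and . or , separators.
NUMBER_RE = re.compile(r"^[+-]?\d+(?:[.,]\d+)*$")


def tokenize(sentence):
    """Assumption: the sentence is already tokenised and space-separated."""
    return sentence.split()


def pct_numbers(tokens):
    """Fraction of tokens that are numbers (0.0 for an empty sentence)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if NUMBER_RE.match(t)) / len(tokens)


def avg_token_length(tokens):
    """Average number of characters per token."""
    if not tokens:
        return 0.0
    return sum(len(t) for t in tokens) / len(tokens)


def punct_diff(source, target, mark, normalise=False):
    """Absolute difference in the count of one punctuation mark,
    optionally normalised by the target length in tokens."""
    diff = abs(source.count(mark) - target.count(mark))
    if not normalise:
        return diff
    target_len = len(tokenize(target))
    return diff / target_len if target_len else 0.0


def surface_features(source, target):
    """Compute a small, illustrative subset of the listed features."""
    src, tgt = tokenize(source), tokenize(target)
    return {
        "src_tokens": len(src),
        "tgt_tokens": len(tgt),
        "src_tgt_ratio": len(src) / len(tgt) if tgt else 0.0,
        "tgt_src_ratio": len(tgt) / len(src) if src else 0.0,
        "avg_src_token_len": avg_token_length(src),
        "avg_tgt_token_len": avg_token_length(tgt),
        "pct_numbers_src": pct_numbers(src),
        "pct_numbers_tgt": pct_numbers(tgt),
        "comma_diff": punct_diff(source, target, ","),
        "comma_diff_norm": punct_diff(source, target, ",", normalise=True),
    }


if __name__ == "__main__":
    feats = surface_features("The 3 cats , 2 dogs .", "Les 3 chats et 2 chiens .")
    for name, value in feats.items():
        print(f"{name}\t{value:.4f}")
```

The remaining punctuation features follow the same pattern with a different mark passed to punct_diff.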