Analysis and Prediction of Unalignable Words in Parallel Text
Published in EACL, 2014
Frances Yung, Kevin Duh, Yuji Matsumoto
Professional human translators usually do not employ the concept of word alignments, producing translations ‘sense-forsense’ instead of ‘word-for-word’. This suggests that unalignable words may be prevalent in the parallel text used for machine translation (MT). We analyze this phenomenonin-depth for Chinese-English translation. We further propose a simple and effective method to improve automatic word alignment by pre-removing unalignable words, and show improvements on hierarchical MT systems in both translation directions.