Sheng Li, Harbin Institute of Technology, China
Jun Li, Harbin Institute of Technology, China
An improved approach to phrase extraction is proposed for phrase-based statistical machine translation. The paper investigates whether using n-best word alignments, rather than only the one-best alignment, makes phrase extraction more effective. In the presented approach, bilingual phrase pairs are extracted by combining word-to-word links from the n-best alignments between source and target sentences. First, the n-best alignments are divided into hierarchical layers according to word co-occurrence frequencies. Second, candidate phrase pairs are extracted from each layer. Experimental results show that the presented approach outperforms the baseline system, Pharaoh, in both NIST and BLEU scores. Using n-best alignments as an extension of the one-best alignment is therefore effective for phrase extraction.
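The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how layers are formed, so the sketch assumes that a link's frequency is the number of n-best alignments containing it, and that each layer keeps the links reaching a given count threshold. The function names (`layer_links`, `extract_phrases`, `extract_from_nbest`), the `thresholds` and `max_len` parameters, and the use of the standard alignment-consistency criterion (without extension over unaligned target words) are all assumptions made for illustration.

```python
from collections import Counter


def layer_links(nbest, thresholds):
    """Group links into layers (assumed scheme): the layer for threshold t
    keeps every link that appears in at least t of the n-best alignments."""
    counts = Counter(link for alignment in nbest for link in set(alignment))
    return {t: {link for link, c in counts.items() if c >= t} for t in thresholds}


def extract_phrases(links, src_len, max_len=4):
    """Extract phrase pairs consistent with a set of (src, tgt) links.

    A pair of spans is kept when no link connects a word inside the
    target span to a word outside the source span.
    """
    pairs = set()
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            tgt = [j for (i, j) in links if s1 <= i <= s2]
            if not tgt:
                continue  # no aligned word in this source span
            t1, t2 = min(tgt), max(tgt)
            if t2 - t1 + 1 > max_len:
                continue  # target span too long
            if all(s1 <= i <= s2 for (i, j) in links if t1 <= j <= t2):
                pairs.add(((s1, s2), (t1, t2)))
    return pairs


def extract_from_nbest(nbest, src_len, thresholds, max_len=4):
    """Union of the candidate phrase pairs extracted from every layer."""
    layers = layer_links(nbest, thresholds)
    result = set()
    for t in thresholds:
        result |= extract_phrases(layers[t], src_len, max_len)
    return result


# Toy example: two alignments of a 2-word source sentence (0-based indices).
nbest = [{(0, 0), (1, 1)}, {(0, 0), (1, 2)}]
candidates = extract_from_nbest(nbest, src_len=2, thresholds=[2, 1])
```

In this toy example the strict layer (threshold 2) contains only the link shared by both alignments, while the looser layer (threshold 1) adds the links on which the alignments disagree, so it contributes phrase pairs that a single alignment would not yield.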
Citation:
Yong-Zeng Xue, Sheng Li, Tie-Jun Zhao, Mu-Yun Yang, Jun Li, "Bilingual Phrase Extraction from N-Best Alignments," First International Conference on Innovative Computing, Information and Control - Volume III (ICICIC'06), vol. 3, pp. 410-414, 2006