1. Introduction
The Lancaster Corpus of Mandarin Chinese (LCMC) is a one-million-word balanced corpus of written Mandarin. The corpus is designed as a Chinese match for the FLOB (Hundt, Sand and Siemund 1998) and Frown (Hundt, Sand and Skandera 1999) corpora of British and American English, respectively. It was created as part of the research project “Contrastive English and Chinese”…