A treebank of Buddhist Chinese texts and its applications
Speaker:
John Lee (City University of Hong Kong)
Abstract:
This talk presents a dependency treebank of Buddhist Chinese texts, which contains over 50K characters drawn from four sutras in the Chinese Buddhist Canon. The treebank is annotated with the part-of-speech tagset of the Penn Chinese Treebank and with the Stanford Dependencies for Chinese. We apply the treebank to explore linguistic changes in Medieval Chinese, focusing on the vernacular and literary styles as reflected in usage patterns of classifiers, demonstratives, and copulae. We also discuss the use of the treebank for profiling the characters who appear in the Canon, for example through their associations with verbs and toponyms and through their conversational networks.