ClassifierGuesser: A context-based classifier prediction system for Chinese language learners (Peinelt et al., 2017)

Citation

Peinelt, N., Liakata, M., & Hsieh, S. K. (2017, November). ClassifierGuesser: A context-based classifier prediction system for Chinese language learners. In Proceedings of the IJCNLP 2017, System Demonstrations (pp. 41-44).

My thoughts

  • They mention databases with semantic features of Chinese classifiers (Gao, 2011). Full reference: Helena Hong Gao. 2011. E-learning design for Chinese classifiers: Reclassification. Communications in Computer and Information Science, 177:186–199.
    • I need to check it out
  • It just occurred to me: could I use classifiers to predict adjectives? Wang et al. (2023) found that, for models, having classifiers is beneficial (compared with not having them) when predicting both verbs and nouns. They used word-level BERT (wobert_chinese_base) to predict subsequent nouns and preceding verbs in two environments: one with classifiers and one without.
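
To make the "two environments" idea concrete for myself, here is a minimal sketch of how one tokenized phrase could be turned into a with-classifier and a without-classifier input with the target word masked. The helper name `build_environments`, the example sentence, and the choice to simply delete the classifier token are my own assumptions; this is not the setup from Wang et al. (2023), and no model is called here.

```python
# Hypothetical helper, not from Wang et al. (2023): it only prepares inputs.
MASK = "[MASK]"  # BERT-style mask token


def build_environments(tokens, classifier_index, target_index):
    """Return (with_classifier, without_classifier) token lists, target masked."""
    if classifier_index == target_index:
        raise ValueError("the classifier itself cannot be the prediction target")
    # Environment 1: keep the classifier, mask the word to be predicted.
    with_cl = [MASK if i == target_index else tok for i, tok in enumerate(tokens)]
    # Environment 2: same masked input, with the classifier token removed
    # (an assumption about how a "no classifier" condition could be built).
    without_cl = [tok for i, tok in enumerate(with_cl) if i != classifier_index]
    return with_cl, without_cl


# Made-up example: 我买了一本书 ("I bought a book"), classifier 本, target noun 书.
tokens = ["我", "买", "了", "一", "本", "书"]
with_cl, without_cl = build_environments(tokens, classifier_index=4, target_index=5)
print("".join(with_cl))
print("".join(without_cl))
```

The two strings could then be passed to a masked language model to compare how well the target is recovered in each condition; for adjectives, the same helper would apply with `target_index` pointing at the adjective.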

Introduction

  • Languages such as Chinese are characterized by the existence of a class of words commonly referred to as ‘classifiers’ or ‘measure words’.
  • Syntax

    • Classifiers are an obligatory component of a quantifier phrase, which is contained in a noun phrase or a verb phrase.
  • Semantics

    • A classifier modifies the quantity or frequency of its head word and requires a certain degree of shared properties between the classifier and the head.
  • Event classifier: 场
  • Context is an important factor in classifier selection: compare 一颗球 (a ball, the object) with 一场精彩的球 (an exciting ball game, the event).
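
The 球 example can be sketched as a toy count-based predictor: the head noun alone is ambiguous between 颗 and 场, and a context word tips the balance. This is only an illustration under my own assumptions (hand-made examples, simple additive counts, hypothetical function names `train` and `predict`); it is not the model used in ClassifierGuesser.

```python
from collections import Counter, defaultdict

# Toy, hand-made examples (not taken from any corpus):
# (context words, head noun, classifier)
EXAMPLES = [
    (["一"], "球", "颗"),
    (["一", "红色"], "球", "颗"),
    (["一", "精彩"], "球", "场"),
    (["一", "激烈"], "球", "场"),
]


def train(examples):
    """Count how often each classifier co-occurs with each head and context word."""
    head_counts = defaultdict(Counter)
    context_counts = defaultdict(Counter)
    for context, head, classifier in examples:
        head_counts[head][classifier] += 1
        for word in context:
            context_counts[word][classifier] += 1
    return head_counts, context_counts


def predict(head, context, head_counts, context_counts):
    """Pick the classifier with the highest summed head + context count."""
    scores = Counter(head_counts[head])  # copy, so the model is not modified
    for word in context:
        scores.update(context_counts[word])  # Counter.update adds counts
    if not scores:
        return None  # unseen head and unseen context
    return scores.most_common(1)[0][0]


model = train(EXAMPLES)
print(predict("球", ["精彩"], *model))
print(predict("球", ["红色"], *model))
```

With the head noun alone the two classifiers tie in this toy data, which is exactly why context matters for the prediction.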

Corpus

CCD corpus (ChineseClassifierDataset)

  • Based on three openly available POS-tagged Chinese corpora:
    • The Lancaster Corpus of Mandarin Chinese (McEnery and Xiao, 2004)
    • The UCLA Corpus of Written Chinese (Tao and Xiao, 2012)
    • The Leiden Weibo Corpus (van Esch, 2012)
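
Since the source corpora are POS tagged, classifier and head pairs could in principle be pulled out by tag. Below is a minimal sketch under my own assumptions: sentences are lists of (word, tag) tuples, classifiers carry the tag "q", noun tags start with "n", and the head is the first noun within a small window after the classifier. The actual extraction procedure behind the CCD corpus is not described in these notes, and the function name is hypothetical.

```python
def extract_classifier_head_pairs(tagged_sentence, max_gap=3):
    """Return (classifier, head noun) pairs from one POS-tagged sentence.

    Assumptions (not from the paper): tag "q" marks a classifier, tags
    starting with "n" mark nouns, and at most `max_gap` tokens (e.g.
    adjectives, 的) may stand between the classifier and its head.
    """
    pairs = []
    for i, (word, tag) in enumerate(tagged_sentence):
        if tag != "q":
            continue
        # Look at the next max_gap + 1 tokens for the first noun.
        window = tagged_sentence[i + 1 : i + 2 + max_gap]
        for next_word, next_tag in window:
            if next_tag.startswith("n"):
                pairs.append((word, next_word))
                break
    return pairs


# Made-up tagged sentence: 一 场 精彩 的 球
sentence = [("一", "m"), ("场", "q"), ("精彩", "a"), ("的", "u"), ("球", "n")]
pairs = extract_classifier_head_pairs(sentence)
print(pairs)
```

A window-based heuristic like this will miss heads that sit further away or that are verbs (verbal classifiers), so it is only a starting point for exploring the data.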