ZHOU Ai-wu, MA Na-na, LIU Hui-ting. Sentiment Text Classification Based on Chi-square Statistics[J]. Microelectronics & Computer, 2017, 34(8): 57-61.

Sentiment Text Classification Based on Chi-square Statistics

  • Sentiment texts are short, carry little information, and yield sparse features, and the n-gram approach ignores the redundancy and relevance between words. This paper proposes an n-gram feature selection method based on chi-square statistics. First, each feature is evaluated by taking into account the simultaneous or individual occurrence of features within the feature set, based on the idea that the occurrence of one feature but not the other may also convey valuable information for discrimination. Then the chi-square statistic is used to calculate the relevance between features and categories, and the redundancy between words is reduced, so that n-gram features with high category relevance and low redundancy can be extracted. Finally, a Support Vector Machine classifier is used to identify text orientation on different corpora; the experimental results show that the method improves the accuracy of text classification.
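
As a rough illustration of the selection step described in the abstract, the sketch below scores word n-grams against one category with the textbook 2x2 chi-square statistic, chi2(t, c) = N(AD - BC)^2 / ((A+C)(B+D)(A+B)(C+D)), and keeps the top-scoring ones. This is the standard statistic only, not the paper's variant that also weighs the simultaneous or individual occurrence of features, and it does not include the SVM stage. The function names (ngrams, chi_square_scores, select_top_k) and the toy corpus are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) present in one document."""
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def chi_square_scores(docs, labels, positive, n_max=2):
    """Score every n-gram against the category `positive` (hypothetical helper)."""
    doc_grams = [ngrams(d.lower().split(), n_max) for d in docs]
    n_docs = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    in_pos = Counter()  # A: documents of the category that contain the n-gram
    in_all = Counter()  # A + B: all documents that contain the n-gram
    for grams, y in zip(doc_grams, labels):
        for g in grams:
            in_all[g] += 1
            if y == positive:
                in_pos[g] += 1
    scores = {}
    for g, total in in_all.items():
        a = in_pos[g]               # in category, contains n-gram
        b = total - a               # other categories, contains n-gram
        c = n_pos - a               # in category, lacks n-gram
        d = n_docs - n_pos - b      # other categories, lacks n-gram
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[g] = n_docs * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring n-grams; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy corpus, invented for illustration only.
docs = ["good movie", "good plot", "bad movie", "bad plot"]
labels = ["pos", "pos", "neg", "neg"]
scores = chi_square_scores(docs, labels, positive="pos")
selected = select_top_k(scores, 2)
```

On this toy corpus the polarity words "good" and "bad" score highest, while "movie" and "plot", which occur equally in both classes, score zero. In a full pipeline the selected n-grams would serve as the feature set passed to a classifier such as an SVM.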