CHANG Qing. Method of Spam Filtering Based on Naive Bayesian Algorithm with Token Time-Series[J]. Microelectronics & Computer, 2010, 27(5): 212-216.

Method of Spam Filtering Based on Naive Bayesian Algorithm with Token Time-Series

  • Abstract: The naive Bayesian classification algorithm is an effective spam filtering technique. Information on the Internet undergoes concept drift over time, so tokens that have recently appeared in spam are important evidence for identifying spam. In the proposed approach, recent spam tokens are recorded separately, and during classification these tokens are assigned a higher prior probability of indicating spam. Experimental comparison shows that the proposed method achieves higher accuracy, precision, and recall than traditional naive Bayesian spam filtering.
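
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the weighting formula, so the class name `RecencyBoostedNB`, the `window` and `boost` parameters, the Laplace smoothing, and the choice to add `log(boost)` to the spam score of each recently seen spam token are all assumptions made for illustration.

```python
import math
from collections import Counter


class RecencyBoostedNB:
    """Multinomial naive Bayes with a boost for recently seen spam tokens.

    Hypothetical sketch: the boost rule and all parameters are assumptions,
    not details taken from the paper.
    """

    def __init__(self, window=7.0, boost=2.0):
        self.window = window      # assumed recency window, in the caller's time units
        self.boost = boost        # assumed multiplier applied to recent spam tokens
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}
        self.last_spam_seen = {}  # token -> latest timestamp at which it appeared in spam

    def train(self, tokens, is_spam, timestamp):
        self.docs[is_spam] += 1
        self.counts[is_spam].update(tokens)
        if is_spam:
            # Record spam tokens separately, keyed by their most recent timestamp.
            for tok in tokens:
                prev = self.last_spam_seen.get(tok, timestamp)
                self.last_spam_seen[tok] = max(prev, timestamp)

    def _is_recent(self, tok, now):
        seen = self.last_spam_seen.get(tok)
        return seen is not None and 0 <= now - seen <= self.window

    def _log_likelihood(self, tok, label, vocab_size):
        # Laplace-smoothed token likelihood for the given class.
        total = sum(self.counts[label].values())
        return math.log((self.counts[label][tok] + 1) / (total + vocab_size))

    def spam_score(self, tokens, now):
        """Return log P(spam | tokens) - log P(ham | tokens), up to the boost."""
        vocab = set(self.counts[True]) | set(self.counts[False])
        v = len(vocab) + 1  # +1 reserves probability mass for unseen tokens
        n_docs = self.docs[True] + self.docs[False]
        score = {}
        for label in (True, False):
            s = math.log((self.docs[label] + 1) / (n_docs + 2))
            for tok in tokens:
                s += self._log_likelihood(tok, label, v)
                if label and self._is_recent(tok, now):
                    s += math.log(self.boost)  # raise the weight of recent spam tokens
            score[label] = s
        return score[True] - score[False]

    def is_spam(self, tokens, now):
        return self.spam_score(tokens, now) > 0


clf = RecencyBoostedNB(window=7.0, boost=2.0)
clf.train(["cheap", "pills"], is_spam=True, timestamp=100.0)
clf.train(["meeting", "notes"], is_spam=False, timestamp=100.0)
recent_score = clf.spam_score(["pills"], now=101.0)   # inside the window
stale_score = clf.spam_score(["pills"], now=1000.0)   # outside the window
```

In this sketch the same token scores higher for spam while it is inside the recency window and falls back to the plain naive Bayes score afterwards, which is one simple way to model the time-dependent weighting the abstract describes.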

     
