Decision Tree Algorithm Based on Attribute Similarity

LU Qiu, CHENG Xiao-hui
(Department of Electronic and Computer, Guilin University of Technology, Guilin 541004)

【Abstract】To address the multi-value bias of the ID3 algorithm, this paper proposes NewDtree, an improved ID3 algorithm based on attribute similarity that avoids this bias. A theoretical analysis proves that NewDtree does not suffer from multi-value bias. Analysis of the experimental results shows that NewDtree effectively improves classification accuracy and remedies ID3's tendency to favor attributes with many values when selecting the test attribute.

【Key words】ID3 algorithm; multi-value bias; attribute similarity; NewDtree algorithm
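
The multi-value bias named in the abstract can be illustrated with a minimal sketch. The code below computes the standard ID3 information gain only; it does not implement NewDtree, whose similarity measure is not defined in this abstract. The toy data and the attribute names `weather` and `row_id` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """ID3 information gain obtained by splitting `labels` on attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (an assumption for illustration only).
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
weather = ["sun", "sun", "rain", "rain", "sun", "rain", "rain", "sun"]  # 2 values
row_id = list(range(len(labels)))  # 8 distinct values, no predictive meaning

gain_weather = information_gain(weather, labels)
gain_row_id = information_gain(row_id, labels)
print(f"gain(weather) = {gain_weather:.3f}")
print(f"gain(row_id)  = {gain_row_id:.3f}")
```

In this sketch the identifier-like attribute `row_id` splits the data into single-sample subsets, so its gain equals the full entropy of the labels and exceeds the gain of `weather`, even though `row_id` cannot generalize to unseen samples. This is the behavior that an improved attribute selection criterion is meant to avoid.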