Wine Classification with Decision Trees
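Split the wine dataset into a training set and a test set, then build two decision tree models, tree_1 and tree_2. tree_1 uses information gain as its feature selection criterion, and tree_2 uses the Gini index. Each model is fitted on the training set and then used to predict on both the training set and the test set. Output the following results:
(1) The accuracy of tree_1 (information gain) on the training set and on the test set.
(2) The accuracy of tree_2 (Gini index) on the training set and on the test set.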
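To make the two criteria concrete, here is a minimal pure-Python sketch of how entropy and Gini impurity can be computed for a set of class labels, and how the impurity decrease of a candidate split follows from them. The helper names (entropy, gini, impurity_decrease) and the toy labels are illustrative assumptions and are not part of scikit-learn; the library's internal implementation is different and also handles sample weights.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children.

    With impurity=entropy this is the information gain of the split.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


if __name__ == "__main__":
    # Hypothetical toy node: two samples of class 0 and two of class 1,
    # split perfectly into two pure children.
    parent = [0, 0, 1, 1]
    left, right = [0, 0], [1, 1]
    print("information gain:", impurity_decrease(parent, left, right, entropy))
    print("Gini decrease:", impurity_decrease(parent, left, right, gini))
```

For this toy split the information gain is 1.0 bit and the Gini decrease is 0.5. Both criteria reward the same perfect split but measure it on different scales.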
Source code
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

if __name__ == "__main__":
    print("2 Wine classification with decision trees")
    print("李思强 20201107148")

    # Load the wine dataset and split it into training and test sets.
    # No random_state is set, so the split (and the scores) vary between runs.
    wine = load_wine()
    x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target)

    # tree_1: splits are chosen by information gain (entropy criterion).
    print("tree_1 (information gain)")
    tree_1 = DecisionTreeClassifier(criterion="entropy")
    tree_1.fit(x_train, y_train)
    train_score = tree_1.score(x_train, y_train)
    test_score = tree_1.score(x_test, y_test)
    print("Training set")
    print("Accuracy:", train_score)
    print("Test set")
    print("Accuracy:", test_score)

    # tree_2: splits are chosen by the Gini index.
    print("tree_2 (Gini index)")
    tree_2 = DecisionTreeClassifier(criterion="gini")
    tree_2.fit(x_train, y_train)
    train_score = tree_2.score(x_train, y_train)
    test_score = tree_2.score(x_test, y_test)
    print("Training set")
    print("Accuracy:", train_score)
    print("Test set")
    print("Accuracy:", test_score)
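
Because the script above does not fix a random seed, its accuracies change from run to run, which makes the two criteria hard to compare. The following is a minimal sketch of a reproducible variant, not the original author's method. The helper name compare_criteria, the seed value 0, and the stratified split are all assumptions added for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_criteria(seed=0):
    """Hypothetical helper: train one tree per criterion on the same fixed split.

    Returns {criterion: (train_accuracy, test_accuracy)}.
    """
    wine = load_wine()
    # Assumption: a fixed seed and a stratified split, neither of which
    # appears in the original script.
    x_train, x_test, y_train, y_test = train_test_split(
        wine.data, wine.target, random_state=seed, stratify=wine.target
    )
    scores = {}
    for criterion in ("entropy", "gini"):
        tree = DecisionTreeClassifier(criterion=criterion, random_state=seed)
        tree.fit(x_train, y_train)
        scores[criterion] = (
            tree.score(x_train, y_train),
            tree.score(x_test, y_test),
        )
    return scores


if __name__ == "__main__":
    for criterion, (train_acc, test_acc) in compare_criteria().items():
        print(f"{criterion}: train accuracy = {train_acc:.3f}, "
              f"test accuracy = {test_acc:.3f}")
```

With the seed fixed, repeated runs give the same numbers, so any difference between the two trees comes from the criterion and not from a different split.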