Problem
Problem link
Performance metrics for classification models include accuracy, precision, recall, F1 score, the ROC curve, and AUC, among others. The metrics used in this problem are the following:
- Confusion Matrix:
\[\begin{bmatrix}
TP & FN \\
FP & TN
\end{bmatrix}
\]
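Here TP is the number of true positives, TN true negatives, FP false positives, and FN false negatives. Rows correspond to the actual class and columns to the predicted class, with the positive class first.
\[TP = \text{True Positive} = \text{number of positive samples correctly classified as positive}
\]
\[TN = \text{True Negative} = \text{number of negative samples correctly classified as negative}
\]
\[FP = \text{False Positive} = \text{number of negative samples incorrectly classified as positive}
\]
\[FN = \text{False Negative} = \text{number of positive samples incorrectly classified as negative}
\]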
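The four definitions above can be sketched as a plain counting loop. This is a minimal illustration, assuming binary labels where 1 is the positive class and 0 the negative class; the helper name `confusion_counts` and the sample labels are hypothetical and not part of the problem statement.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for binary labels, assuming 1 = positive, 0 = negative."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # positive sample, predicted positive
        elif a == 1 and p == 0:
            fn += 1  # positive sample, predicted negative
        elif a == 0 and p == 1:
            fp += 1  # negative sample, predicted positive
        elif a == 0 and p == 0:
            tn += 1  # negative sample, predicted negative
    return tp, fn, fp, tn


# Made-up labels for illustration only
actual = [1, 0, 1, 0, 1]
predicted = [1, 0, 0, 1, 1]
tp, fn, fp, tn = confusion_counts(actual, predicted)
print([[tp, fn], [fp, tn]])  # prints [[2, 1], [1, 1]]
```

Pairs with labels other than 0 or 1 are silently skipped in this sketch; whether that is the desired behavior depends on the input guarantees of the problem.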
- Accuracy:
\[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
\]
- F1 Score:
\[F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}
\]
where Precision = \(\frac{TP}{TP + FP}\) and Recall = \(\frac{TP}{TP + FN}\).
- Specificity:
\[Specificity = \frac{TN}{TN + FP}
\]
- Negative Predictive Value:
\[Negative\ Predictive\ Value = \frac{TN}{TN + FN}
\]
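As a worked check of the formulas above, take the hypothetical counts TP = 40, FN = 10, FP = 5, TN = 45 (invented for illustration, not taken from the problem). Substituting the definitions of Precision and Recall into F1 also gives the equivalent form in terms of counts, which is used in the third line.

```latex
\[
\begin{aligned}
Accuracy &= \frac{40 + 45}{40 + 45 + 5 + 10} = \frac{85}{100} = 0.85 \\
Precision &= \frac{40}{40 + 5} \approx 0.889, \qquad Recall = \frac{40}{40 + 10} = 0.8 \\
F1 &= 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}
    = \frac{2\,TP}{2\,TP + FP + FN} = \frac{80}{95} \approx 0.842 \\
Specificity &= \frac{45}{45 + 5} = 0.9 \\
Negative\ Predictive\ Value &= \frac{45}{45 + 10} \approx 0.818
\end{aligned}
\]
```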
The reference solution is as follows:
from collections import Counter
def performance_metrics(actual: list[int], predicted: list[int]) -> tuple:
    # Count how often each (actual, predicted) pair occurs
    data = list(zip(actual, predicted))
    counts = Counter(tuple(pair) for pair in data)
    TP, FN, FP, TN = counts[(1, 1)], counts[(1, 0)], counts[(0, 1)], counts[(0, 0)]
    # Rows are the actual class, columns the predicted class
    confusion_matrix = [[TP, FN], [FP, TN]]
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    negativePredictive = TN / (TN + FN)
    specificity = TN / (TN + FP)
    # Every metric is rounded to three decimal places
    return confusion_matrix, round(accuracy, 3), round(f1, 3), round(specificity, 3), round(negativePredictive, 3)
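The reference solution divides directly, so it raises `ZeroDivisionError` when a denominator is zero, for example when no sample is predicted positive (TP + FP = 0). The sketch below is one possible guarded variant, not part of the original solution: the names `safe_div` and `performance_metrics_safe` are hypothetical, and returning 0.0 for an undefined ratio is an assumed convention that the problem may or may not expect.

```python
from collections import Counter


def safe_div(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Divide, falling back to `default` when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else default


def performance_metrics_safe(actual: list[int], predicted: list[int]) -> tuple:
    # Count (actual, predicted) pairs; missing keys count as 0
    counts = Counter(zip(actual, predicted))
    tp, fn, fp, tn = counts[(1, 1)], counts[(1, 0)], counts[(0, 1)], counts[(0, 0)]

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    specificity = safe_div(tn, tn + fp)
    npv = safe_div(tn, tn + fn)

    # Same return order as the reference: matrix, accuracy, F1, specificity, NPV
    return (
        [[tp, fn], [fp, tn]],
        round(accuracy, 3),
        round(f1, 3),
        round(specificity, 3),
        round(npv, 3),
    )


# Made-up labels for illustration only
print(performance_metrics_safe([1, 0, 1, 0, 1], [1, 0, 0, 1, 1]))
# prints ([[2, 1], [1, 1]], 0.6, 0.667, 0.5, 0.5)

# No positive predictions: precision and F1 fall back to 0.0 instead of raising
print(performance_metrics_safe([1, 0], [0, 0]))
# prints ([[0, 1], [0, 1]], 0.5, 0.0, 1.0, 0.5)
```

For inputs where every denominator is non-zero, this variant returns the same values as the reference solution.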