使用k-近邻算法改进约会网站的配对效果(kNN)

目录

谷歌笔记本(可选)

准备数据:从文本文件中解析数据

编写算法:编写kNN算法

分析数据:使用Matplotlib创建散点图

准备数据:归一化数值

测试算法:作为完整程序验证分类器

使用算法:构建完整可用系统


谷歌笔记本(可选)


from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


准备数据:从文本文件中解析数据


def file2matrix(filename):fr = open(filename)arrayOfLines = fr.readlines()numberOfLines = len(arrayOfLines)returnMat = zeros((numberOfLines, 3))classLabelVector = []index = 0for line in arrayOfLines:line = line.strip()listFromLine = line.split('\t')returnMat[index, :] = listFromLine[0:3]classLabelVector.append(int(listFromLine[-1]))index += 1return returnMat, classLabelVector
datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')
datingDataMat

array([[4.0920000e+04, 8.3269760e+00, 9.5395200e-01], [1.4488000e+04, 7.1534690e+00, 1.6739040e+00], [2.6052000e+04, 1.4418710e+00, 8.0512400e-01], ..., [2.6575000e+04, 1.0650102e+01, 8.6662700e-01], [4.8111000e+04, 9.1345280e+00, 7.2804500e-01], [4.3757000e+04, 7.8826010e+00, 1.3324460e+00]])

datingLabels[:10]

[3, 2, 1, 1, 1, 1, 3, 3, 1, 3]


编写算法:编写kNN算法


from numpy import *
import operatordef classify0(inX, dataSet, labels, k):dataSetSize = dataSet.shape[0]diffMat = tile(inX, (dataSetSize, 1)) - dataSetsqDiffMat = diffMat ** 2sqDistances = sqDiffMat.sum(axis=1)distances = sqDistances**0.5sortedDistIndicies = distances.argsort()classCount = {}for i in range(k):voteIlabel = labels[sortedDistIndicies[i]]classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)return sortedClassCount[0][0]

分析数据:使用Matplotlib创建散点图


import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2])
plt.show()

 

import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2],15.0*array(datingLabels), 15.0*array(datingLabels))
plt.show()

import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1],15.0*array(datingLabels), 15.0*array(datingLabels))
plt.show()


准备数据:归一化数值


def autoNorm(dataSet):minVals = dataSet.min(0)maxVals = dataSet.max(0)ranges = maxVals - minValsnormDataSet = zeros(shape(dataSet))m = dataSet.shape[0]normDataSet = dataSet - tile(minVals, (m,1))normDataSet = normDataSet/tile(ranges, (m,1))return normDataSet, ranges, minVals
normMat, ranges, minVals = autoNorm(datingDataMat)
normMat
array([[0.44832535, 0.39805139, 0.56233353],[0.15873259, 0.34195467, 0.98724416],[0.28542943, 0.06892523, 0.47449629],...,[0.29115949, 0.50910294, 0.51079493],[0.52711097, 0.43665451, 0.4290048 ],[0.47940793, 0.3768091 , 0.78571804]])
ranges
array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])
minVals
array([0.      , 0.      , 0.001156])

测试算法:作为完整程序验证分类器


def datingClassTest():hoRatio = 0.1datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')normMat, ranges, minVals = autoNorm(datingDataMat)m = normMat.shape[0]numTestVecs = int(m*hoRatio)errorCount = 0for i in range(numTestVecs):classifierResult = classify0(normMat[i,:], normMat[numTestVecs:m,:],datingLabels[numTestVecs:m],3)print("the classifierResult came back with: %d,\the real answer is: %d" % (classifierResult, datingLabels[i]))if (classifierResult != datingLabels[i]):errorCount += 1print("the total error rate is: %f" % (errorCount/float(numTestVecs)))

datingClassTest()
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 1
the total error rate is: 0.050000

使用算法:构建完整可用系统


def classifyPerson():resultList = ['not at all','in small doses','in large doses',]percentTats = float(input("percentage of time spent playing video games?"))ffMiles = float(input("frequent flier miles earned per year?"))iceCream = float(input("liters of ice cream consumed per year?"))datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')normMat, ranges, minVals = autoNorm(datingDataMat)inArr = array([ffMiles, percentTats, iceCream])classifierResult = classify0((inArr - minVals)/ranges, normMat, datingLabels, 3)print("You will probably like this person:", resultList[classifierResult - 1])
classifyPerson()
percentage of time spent playing video games?10
frequent flier miles earned per year?10000
liters of ice cream consumed per year?0.5
You will probably like this person: in small doses

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.hqwc.cn/news/485711.html

如若内容造成侵权/违法违规/事实不符,请联系编程知识网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

vulvhub-----Hacker-KID靶机

打靶详细教程 1.网段探测2.端口服务扫描3.目录扫描4.收集信息burp suite抓包 5.dig命令6.XXE漏洞读取.bashrc文件 7.SSTI漏洞8.提权1.查看python是否具备这个能力2.使用python执行exp.py脚本,如果提权成功,靶机则会开放5600端口 1.网段探测 ┌──(root…

C# .Net 发布后,把dll全部放在一个文件夹中,让软件目录更整洁

PublishFolderCleaner – Github 测试环境: .Net 8 Program.cs 代码 // https://github.com/dotnet-campus/dotnetcampus.DotNETBuildSDK/tree/master/PublishFolderCleanerusing System.Diagnostics; using System.Text;// 名称, 不用写 .exe var exeName "AbpDemo&…

基于Spring Boot+Mybatis+Shiro+EasyUI的宠物医院管理系统

项目介绍 本系统前台面向的用户是客户,客户可以进行预约、浏览医院发布的文章、进入医院商城为宠物购物、如有疑问可以向官方留言、还可以查看关于自己的所有记录信息,如:看病记录、预约记录、疫苗注射记录等。后台面向的用户是医院人员&…

一些可以参考的文档集合16

之前的文章集合: 一些可以参考文章集合1_xuejianxinokok的博客-CSDN博客 一些可以参考文章集合2_xuejianxinokok的博客-CSDN博客 一些可以参考的文档集合3_xuejianxinokok的博客-CSDN博客 一些可以参考的文档集合4_xuejianxinokok的博客-CSDN博客 一些可以参考的文档集合5…

个人网站如何调用微信公司的登录接口,实现微信登录

个人网站如何调用微信公司的登录接口,实现微信登录!如果你个人网站想使用微信公司的登录接口,目前是需要收费的,微信公司登录接口不再是免费的了。需要缴纳认证费,每年认证费是300元/年。才能调用微信的登录接口。 如图&#xff0…

FairyGUI × Cocos Creator 3.x 使用方式

前言 上一篇文章 FariyGUI Cocos Creator 入门 简单介绍了FairyGUI,并且按照官方demo成功在Cocos Creator2.4.0上运行起来了。 当我今天使用Creator 3.x 再引入2.x的Lib时,发现出现了报错。 这篇文章将介绍如何在Creator 3.x上使用fgui。 引入 首先&…

安泰高压放大器用途有哪些呢

高压放大器是一种功能强大的放大器,其广泛的用途使其在众多领域中发挥着重要作用。下面安泰电子将详细介绍高压放大器的用途和应用场景。 实验室仪器:在科研实验室中,高压放大器通常用于各种实验设备中,如光谱仪、质谱仪、电化学分…

【Appium UI自动化】pytest运行常见错误解决办法

通过Appium工具录制代码在pycharm上运行报错: 错误一: 1.提示 setup() 方法运行 error failed 解决办法:未创建 init __ 方法,创建一个空的__init.py文件就解决了。 原因: 错误二: 2.运行代码&#xff…

如何进行非线性负载测试?

非线性负载测试是模拟真实用户行为和系统性能的测试方法,它可以帮助我们发现系统在高并发、高负载情况下的性能瓶颈和潜在问题。以下是进行非线性负载测试的一些建议: 在进行非线性负载测试之前,首先要明确测试的目标,例如测试系统…

ESP8266智能家居(3)——单片机数据发送到mqtt服务器

1.主要思想 前期已学习如何用ESP8266连接WIFI,并发送数据到服务器。现在只需要在单片机与nodeMCU之间建立起串口通信,这样单片机就可以将传感器测到的数据:光照,温度,湿度等等传递给8266了,然后8266再对数据…

NATS学习笔记(一)

NATS是什么? NATS是一个开源的、轻量级、高性能的消息传递系统,它基于发布/订阅模式,由Apcera公司开发和维护。 NATS的功能 发布/订阅:NATS的核心是一个发布/订阅消息传递系统,允许消息生产者发布消息到特定的主题…

ONLYOFFICE 桌面应用程序 v8.0 发布:全新 RTL 界面、本地主题、Moodle 集成等你期待的功能来了!

目录 📘 前言 📟 一、什么是 ONLYOFFICE 桌面编辑器? 📟 二、ONLYOFFICE 8.0版本新增了那些特别的实用模块? 2.1. 可填写的 PDF 表单 2.2. 双向文本 2.3. 电子表格中的新增功能 单变量求解:…