[CAPTCHA Series] Analyzing Slide CAPTCHAs in Depth with a Reverse-Engineering Mindset (Including the Trajectory Algorithm)

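Contents

1. Introduction
2. Packet Capture Analysis
3. API Analysis
4. Analyzing the Slide CAPTCHA Pop-up
5. Analyzing Slide Verification
6. Implementing the Trajectory Generation Algorithm
7. Algorithm for Generating the w Parameter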

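1. Introduction

CAPTCHAs are one of the most common and important defenses against bots (that is, crawlers). In the crawler field, deep expertise in any single area (CAPTCHA recognition, JS reverse engineering, or app reverse engineering) is enough to make someone an expert.

Traditional CAPTCHAs, as we all know, are static images containing Arabic numerals and letters plus some noise (trivially easy by today's standards). Today's advanced CAPTCHAs come in all shapes and sizes, and many vendors offer mature solutions.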

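[Image]

The image above shows the CAPTCHA that pops up when using ChatGPT. Quite creative.

This article focuses on slide CAPTCHAs, since they are still common on most websites. Typical vendors include:

Geetest (极验), Yidun (易盾), Shumei (数美)…

Analysis target: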

aHR0cHM6Ly93d3cuZ2VldGVzdC5jb20vZGVtby9zbGlkZS1mbG9hdC5odG1s
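
The target above is given as a Base64 string rather than a plain URL. A minimal sketch of how to decode it with Python's standard library follows; the variable names are only illustrative.

```python
import base64

# The Base64-encoded analysis target, copied verbatim from the article.
encoded = "aHR0cHM6Ly93d3cuZ2VldGVzdC5jb20vZGVtby9zbGlkZS1mbG9hdC5odG1s"

# b64decode returns bytes; the payload is plain ASCII, so UTF-8 decoding is safe.
target_url = base64.b64decode(encoded).decode("utf-8")
print(target_url)
```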

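Reverse engineering a slide CAPTCHA ultimately comes down to the encryption of the slide trajectory. CAPTCHAs from all vendors follow broadly the steps below.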

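2. Packet Capture Analysis

First, open the site, capture the traffic, and look through the requests. Three requests stand out as suspicious and are analyzed below.

[Image]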

3. API Analysis

The first URL (register-slide?t=1698396900148) carries a timestamp. It requests the slide CAPTCHA and returns some important data:

// Response
{
  "success": 1,
  "challenge": "a2243f6f4343698c34cbe0794553e82b",
  "gt": "019924a82c70bb123aae90d483087f94",
  "new_captcha": true
}

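Two values matter here, challenge and gt. Both look encrypted (opaque 32-character hex strings), and their exact purpose is not yet clear, so let's keep going.

The second URL (gettype.php) presumably fetches the verification type, since Geetest offers many CAPTCHA types. Its response: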

// Response
geetest_1698396906979({
  "status": "success",
  "data": {
    "type": "fullpage",
    "static_servers": ["static.geetest.com/", "dn-staticdown.qbox.me/"],
    "voice": "/static/js/voice.1.2.3.js",
    "click": "/static/js/click.3.0.9.js",
    "beeline": "/static/js/beeline.1.0.1.js",
    "fullpage": "/static/js/fullpage.9.1.8-bfget5.js",
    "slide": "/static/js/slide.7.9.0.js",
    "geetest": "/static/js/geetest.6.0.9.js",
    "aspect_radio": {"slide": 103, "click": 128, "voice": 128, "beeline": 50}
  }
})

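In other words, the response lists the JS files used by the various CAPTCHA types. Since type is fullpage, fullpage.9.1.8-bfget5.js is the script for this initial stage; slide.7.9.0.js, listed alongside it, appears to be the script for the slide CAPTCHA itself, and the remaining entries are related JS files.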
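
Several of the responses in this article are JSONP, meaning the JSON payload is wrapped in a callback such as geetest_1698396906979(...). The sketch below shows one way to strip that wrapper in Python before reading fields. The helper name parse_jsonp is hypothetical, and the sample string is an abbreviated copy of the response above, not a live capture.

```python
import json
import re


def parse_jsonp(text):
    """Strip a callback(...) wrapper and decode the JSON payload inside it."""
    # Callback name, opening parenthesis, payload, closing parenthesis, optional ';'.
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Abbreviated sample based on the gettype.php response shown above.
raw = (
    'geetest_1698396906979({"status": "success", "data": {"type": "fullpage", '
    '"fullpage": "/static/js/fullpage.9.1.8-bfget5.js", '
    '"slide": "/static/js/slide.7.9.0.js"}})'
)

payload = parse_jsonp(raw)
print(payload["data"]["type"])
print(payload["data"]["slide"])
```

The same helper applies to the later get.php and ajax.php responses, since they use the same callback wrapper.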

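Note also that the second URL's request parameters include gt, which is exactly the value returned by the first URL.

[Image]

The third URL (get.php) takes gt, w, and challenge as request parameters:

[Image]

gt and challenge both come from the first URL's response. As for w, it is a long encrypted string, presumably computed from the trajectory, environment checks, and other parameters.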

// Response
geetest_1698396904448({"status": "success","data": {"theme": "wind","theme_version": "1.5.8","static_servers": ["static.geetest.com", "dn-staticdown.qbox.me"],"api_server": "api.geetest.com","logo": true,"feedback": "https://www.geetest.com/contact#report","c": [12, 58, 98, 36, 43, 95, 62, 15, 12],"s": "2c506476","i18n_labels": {"copyright": "\u7531\u6781\u9a8c\u63d0\u4f9b\u6280\u672f\u652f\u6301","error": "\u7f51\u7edc\u4e0d\u7ed9\u529b","error_content": "\u8bf7\u70b9\u51fb\u6b64\u5904\u91cd\u8bd5","error_title": "\u7f51\u7edc\u8d85\u65f6","fullpage": "\u667a\u80fd\u68c0\u6d4b\u4e2d","goto_cancel": "\u53d6\u6d88","goto_confirm": "\u524d\u5f80","goto_homepage": "\u662f\u5426\u524d\u5f80\u9a8c\u8bc1\u670d\u52a1Geetest\u5b98\u7f51","loading_content": "\u667a\u80fd\u9a8c\u8bc1\u68c0\u6d4b\u4e2d","next": "\u6b63\u5728\u52a0\u8f7d\u9a8c\u8bc1","next_ready": "\u8bf7\u5b8c\u6210\u9a8c\u8bc1","read_reversed": false,"ready": "\u70b9\u51fb\u6309\u94ae\u8fdb\u884c\u9a8c\u8bc1","refresh_page": "\u9875\u9762\u51fa\u73b0\u9519\u8bef\u5566\uff01\u8981\u7ee7\u7eed\u64cd\u4f5c\uff0c\u8bf7\u5237\u65b0\u6b64\u9875\u9762","reset": "\u8bf7\u70b9\u51fb\u91cd\u8bd5","success": "\u9a8c\u8bc1\u6210\u529f","success_title": "\u901a\u8fc7\u9a8c\u8bc1"}}
})

In the response above, the values likely to matter are s and c.

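4. Analyzing the Slide CAPTCHA Pop-up

Next, clear the previously captured requests, then click the verification button to open the slide CAPTCHA. It pops up as follows:

[Image]

Following the screenshot above, let's briefly go through the requests. The first URL (ajax.php) has the following request information:

[Image]

It is basically the same as before: gt, challenge, and w. The first two come from the earlier responses. The only difference is that w has changed, because clicking triggered the CAPTCHA pop-up; presumably w now carries mouse-trajectory information plus some checks.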

// Response
geetest_1698400310968({"status": "success", "data": {"result": "slide"}})

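The click itself was verified, and on success the server returns the result. Here result is slide, which presumably means a slide CAPTCHA comes next.

The second URL (get.php) again carries gt and challenge, but no w. Let's go straight to the response: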

// Response
geetest_1698400310948({"gt": "019924a82c70bb123aae90d483087f94","challenge": "c8c2b2ded3230049af7906ffb5344494gz","id": "ac8c2b2ded3230049af7906ffb5344494","bg": "pictures/gt/cd0bbb6fe/bg/a1152167f.jpg","fullbg": "pictures/gt/cd0bbb6fe/cd0bbb6fe.jpg","link": "","ypos": 39,"xpos": 0,"height": 160,"slice": "pictures/gt/cd0bbb6fe/slice/a1152167f.png","api_server": "https://api.geetest.com","static_servers": ["static.geetest.com/", "dn-staticdown.qbox.me/"],"mobile": true,"theme": "ant","theme_version": "1.2.6","template": "","logo": true,"clean": false,"type": "multilink","fullpage": false,"feedback": "https://www.geetest.com/contact#report","show_delay": 250,"hide_delay": 800,"benchmark": false,"version": "6.0.9","product": "embed","https": true,"width": "100%","show_voice": true,"c": [12, 58, 98, 36, 43, 95, 62, 15, 12],"s": "62542a52","so": 0,"i18n_labels": {"cancel": "\u53d6\u6d88","close": "\u5173\u95ed\u9a8c\u8bc1","error": "\u8bf7\u91cd\u8bd5","fail": "\u8bf7\u6b63\u786e\u62fc\u5408\u56fe\u50cf","feedback": "\u5e2e\u52a9\u53cd\u9988","forbidden": "\u602a\u7269\u5403\u4e86\u62fc\u56fe\uff0c\u8bf7\u91cd\u8bd5","loading": "\u52a0\u8f7d\u4e2d...","logo": "\u7531\u6781\u9a8c\u63d0\u4f9b\u6280\u672f\u652f\u6301","read_reversed": false,"refresh": "\u5237\u65b0\u9a8c\u8bc1","slide": "\u62d6\u52a8\u6ed1\u5757\u5b8c\u6210\u62fc\u56fe","success": "sec \u79d2\u7684\u901f\u5ea6\u8d85\u8fc7 score% \u7684\u7528\u6237","tip": "\u8bf7\u5b8c\u6210\u4e0b\u65b9\u9a8c\u8bc1","voice": "\u89c6\u89c9\u969c\u788d"},"gct_path": "/static/js/gct.b71a9027509bc6bcfef9fc6a196424f5.js"
})

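The response contains the slide CAPTCHA's images (bg, fullbg, slice) and other data.

5. Analyzing Slide Verification

Now drag the slider: move it to the gap until verification succeeds. This motion is the trajectory we will need to simulate later.

[Image]

The request still carries the gt and challenge obtained earlier, plus a new w. Since w changes on every attempt, we can infer that it does contain the slide trajectory, as shown below:

[Image]

The endpoint returns the post-verification result: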

// Response
geetest_1698401308875({
  "success": 1,
  "message": "success",
  "validate": "57aa8ef66d49966e169d10b8eef6886e",
  "score": "1"
})

6. Implementing the Trajectory Generation Algorithm

import random

import numpy as np


class GTrace:
    def __init__(self):
        self.__pos_x = []
        self.__pos_y = []
        self.__pos_z = []

    def __set_pt_time(self):
        """
        Set the timestamp of each point.

        Based on the share of X coordinates falling into each time interval:
          1. 80%~90% of the X coordinates are 15~20 ms apart.
          2. 10%~15% are 20~200 ms or more apart. In [-a, 0, x, ...] there is
             only one x, with a value between 110 and 200. The last 3~5 points
             of the set take values between 50 and 400, and the last point has
             the largest value.

        Rule for the total slide time: the image is 260 wide, which leaves 200
        after removing the slider width.
          If the distance is below 100, the slide takes 1300~1900 ms.
          If the distance is above 100, the slide takes 1700~2100 ms.
        """
        __end_pt_time = []
        __move_pt_time = []
        self.__pos_z = []

        # The main movement takes 80%~90% of the total time.
        total_move_time = self.__need_time * random.uniform(0.8, 0.9)
        start_point_time = 0
        __start_pt_time = [0, 0, int(start_point_time)]

        sum_move_time = 0

        # Emit points 15~20 ms apart until the movement budget is used up.
        _tmp_total_move_time = total_move_time
        while True:
            delta_time = random.uniform(15, 20)
            if _tmp_total_move_time < delta_time:
                break

            sum_move_time += delta_time
            _tmp_total_move_time -= delta_time
            __move_pt_time.append(int(start_point_time + sum_move_time))

        last_pt_time = __move_pt_time[-1]
        __move_pt_time.append(last_pt_time + _tmp_total_move_time)

        # The remaining time is spent on a few slow points at the end.
        sum_end_time = start_point_time + total_move_time
        other_point_time = self.__need_time - sum_end_time
        end_first_ptime = other_point_time / 2

        while True:
            delta_time = random.uniform(110, 200)
            if end_first_ptime - delta_time <= 0:
                break

            end_first_ptime -= delta_time
            sum_end_time += delta_time
            __end_pt_time.append(int(sum_end_time))

        __end_pt_time.append(int(sum_end_time + (other_point_time / 2 + end_first_ptime)))
        # self.__pos_z.extend(__start_pt_time)
        self.__pos_z.extend(__move_pt_time)
        self.__pos_z.extend(__end_pt_time)

    def __set_distance(self, _dist):
        """Set the length of the trajectory to generate."""
        self.__distance = _dist

        # Note: these ranges differ from the 1300~1900 / 1700~2100 ms rule
        # described in the __set_pt_time docstring.
        if _dist < 100:
            self.__need_time = int(random.uniform(500, 1500))
        else:
            self.__need_time = int(random.uniform(1000, 2000))

    def __get_pos_z(self):
        return self.__pos_z

    def __get_pos_y(self):
        _pos_y = []
        point_count = len(self.__pos_z)
        x = np.linspace(-10, 15, point_count - len(_pos_y))
        arct_y = np.arctan(x)

        for _, val in enumerate(arct_y):
            _pos_y.append(val)

        return _pos_y

    def __get_pos_x(self, _distance):
        """
        Follow a standard function curve: start with tanh, end with arctan.
        X coordinates are sampled from this model at evenly spaced points.
        """
        # first_val = random.uniform(-40, -18)
        # _distance += first_val
        # _pos_x = [random.uniform(-40, -18), 0]
        _pos_x = []

        self.__set_distance(_distance)
        self.__set_pt_time()

        point_count = len(self.__pos_z)
        x = np.linspace(-1, 19, point_count - len(_pos_x))
        ss = np.arctan(x)
        th = np.tanh(x)

        # Take the larger of tanh and arctan at every sample.
        for idx in range(0, len(th)):
            if th[idx] < ss[idx]:
                th[idx] = ss[idx]

        # Shift and scale the curve so that it ends near _distance.
        th += 1
        th *= (_distance / 2.5)

        # Add small positive jitter to part of the curve.
        i = 0
        start_idx = int(point_count / 10)
        end_idx = int(point_count / 50)
        delta_pt = abs(np.random.normal(scale=1.1, size=point_count - start_idx - end_idx))
        for idx in range(start_idx, point_count):
            if idx * 1.3 > len(delta_pt):
                break

            th[idx] += delta_pt[i]
            i += 1

        _pos_x.extend(th)
        return _pos_x[-1], _pos_x

    def get_mouse_pos_path(self, distance):
        """Return the mouse trajectory for dragging the slider as [x, y, t] points."""
        result = []
        _distance, x = self.__get_pos_x(distance)
        y = self.__get_pos_y()
        z = self.__get_pos_z()
        for idx in range(len(x)):
            result.append([int(x[idx]), int(y[idx]), int(z[idx])])

        return int(_distance), result


if __name__ == "__main__":
    trace = GTrace()

    distance, res = trace.get_mouse_pos_path(100)
    print(res)
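
The docstring above describes the timing profile the generator aims for: most points 15~20 ms apart, followed by a few slow points at the end. A small sketch of how one might check a generated track against that rule is given below. The helper names timing_share and is_time_monotonic are hypothetical, and the sample track is hand-written for illustration rather than produced by GTrace.

```python
def timing_share(track, low=15, high=20):
    """Return the fraction of consecutive time gaps that fall in [low, high] ms."""
    times = [point[2] for point in track]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 0.0
    in_range = sum(1 for gap in gaps if low <= gap <= high)
    return in_range / len(gaps)


def is_time_monotonic(track):
    """Check that timestamps never go backwards."""
    times = [point[2] for point in track]
    return all(later >= earlier for earlier, later in zip(times, times[1:]))


# Hand-written [x, y, t] points: three short gaps, then one slow final point.
sample = [[0, 0, 0], [3, 0, 16], [9, 0, 33], [20, 1, 50], [21, 1, 200]]

print(timing_share(sample))       # 0.75
print(is_time_monotonic(sample))  # True
```

Passing the list returned by get_mouse_pos_path to these helpers gives a quick sanity check that a track roughly matches the profile described in the docstring.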

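7. Algorithm for Generating the w Parameter

Generating w requires the following key inputs:

c, s, gt, challenge, the trajectory, the slide duration, and the slide distance

The full code is long and fairly complex, so only the final parameter-generation function is shown here; pass in the corresponding parameters and it produces w: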

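// U, V, H, m, gt, rt_, sign_aaa, sign_ep and md5Hash come from the full extracted code, which is not shown here.
function generate_w(params){
    /**
     * Generate w. params must contain distance, passtime, track, c, s, gt, challenge
     */
    var u = new U()["encrypt"](rt_)  // rt_ is a random value, but it must be the same one used below
    var o = {
        "lang": "zh-cn",
        "userresponse": H(params['distance'], params['challenge']),  // pass the slide distance and challenge to H to get userresponse
        "passtime": params['passtime'],  // slide duration; must match the duration in the trajectory
        "imgload": 155,  // image load time; can be a random value
        "aa": sign_aaa(params['track'], params['c'], params['s']),  // pass the trajectory, c, and s to generate aa
        "ep": sign_ep(),  // version numbers, timings, and similar information
        "h9s9": "1816378497",  // fixed value
        "rp": md5Hash(params['gt'] + params['challenge'].slice(0, 32) + params['passtime'])  // MD5 over gt, challenge, and passtime gives rp
    }
    var l = V['encrypt'](gt['stringify'](o), rt_)
    var h = m["$_FCp"](l)
    return h + u
}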
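
Of the fields assembled in generate_w, rp is the easiest to follow in isolation: an MD5 over gt, the first 32 characters of challenge, and passtime. The Python sketch below mirrors only that one line. It assumes md5Hash is a standard MD5 hex digest and that passtime is concatenated as a decimal string, as JavaScript string concatenation would do; neither assumption is confirmed by the code shown. The function name compute_rp is hypothetical.

```python
import hashlib


def compute_rp(gt, challenge, passtime):
    """Mirror of the rp line in generate_w (assumes md5Hash is plain MD5 hex)."""
    # The challenge returned with the slide images is 34 characters long
    # (32 hex characters plus a 2-character suffix); only the first 32 are used.
    message = gt + challenge[:32] + str(passtime)
    return hashlib.md5(message.encode("utf-8")).hexdigest()


# Values copied from the responses earlier in the article; passtime is made up.
gt_value = "019924a82c70bb123aae90d483087f94"
challenge_value = "c8c2b2ded3230049af7906ffb5344494gz"

rp = compute_rp(gt_value, challenge_value, 1500)
print(len(rp))
```

Because of the slice, the two-character suffix on challenge does not influence rp, while any change in passtime does.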