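Predicting the A-Share Market (SSE Composite Index) with XGBoost (code + data + runnable in one click)

If you are interested in AI-assisted stock trading, feel free to add me on WeChat: caihaihua057200 (please note: school/company + name + research direction).

I also have some other AI applications we could explore together (I always open-source my code).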

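1. Introduction

In this installment we return to AI-based stock prediction and look at how AI techniques can be applied to another interesting problem: predicting the A-share market index.

2. The relationship between AI and stocks

In stock prediction, AI plays the role of data analysis and pattern recognition. It cannot guarantee 100% accurate results, but it offers a new way to gain insight into, and understanding of, a forecast.

3. Data collection and processing (fetching up-to-date SSE Composite Index data with akshare)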

import akshare as ak
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
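import xgboost as xgb

# Fetch the daily history of the SSE Composite Index (symbol sh000001)
df = ak.stock_zh_index_daily_em(symbol='sh000001')

Data preprocessing: converting dates into time features, then combining the time features with candlestick (K-line) features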


today = datetime.today()
date_str = today.strftime("%Y%m%d")
base = int(datetime.strptime(date_str, "%Y%m%d").timestamp())
# Offset in days relative to today (negative for past dates)
change1 = lambda x: (int(datetime.strptime(x, "%Y%m%d").timestamp()) - base) / 86400
# Day of the month
change2 = lambda x: (datetime.strptime(str(x), "%Y%m%d")).day
# Day of the week (Monday = 0)
change3 = lambda x: datetime.strptime(str(x), "%Y%m%d").weekday()

df['date'] = df['date'].str.replace('-', '')
X = df['date'].apply(lambda x: change1(x)).values.reshape(-1, 1)
X_month_day = df['date'].apply(lambda x: change2(x)).values.reshape(-1, 1)
X_week_day = df['date'].apply(lambda x: change3(x)).values.reshape(-1, 1)
# Drop the first 29 rows so the time features line up with the 30-day windows below
XX = np.concatenate((X, X_week_day, X_month_day), axis=1)[29:]

# Min-max normalise every non-date column
FT = np.array(df.drop(columns=['date']))
min_vals = np.min(FT, axis=0)
max_vals = np.max(FT, axis=0)
FT = (FT - min_vals) / (max_vals - min_vals)

# Mean, max, min and standard deviation over each 30-day sliding window
window_size = 30
num_rows, num_columns = FT.shape
new_num_rows = num_rows - window_size + 1
result1 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result1[i] = np.mean(window, axis=0)
result2 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result2[i] = np.max(window, axis=0)
result3 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result3[i] = np.min(window, axis=0)
result4 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result4[i] = np.std(window, axis=0)
result_list = [result1, result2, result3, result4]
result = np.hstack(result_list)

# Final feature matrix: time features followed by window statistics
XX = np.concatenate((XX, result), axis=1)
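The four loops above all walk the same 30-day windows, so they can be collapsed into one vectorised pass. The following is only a sketch of that idea, not the article's original code: the helper name `rolling_stats` and the synthetic `demo_ft` matrix are made up for illustration, and it assumes NumPy 1.20 or newer for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_stats(features: np.ndarray, window_size: int = 30) -> np.ndarray:
    """Return per-window mean, max, min and std, stacked column-wise.

    Row i summarises features[i : i + window_size], which is the same
    window the explicit loops use.
    """
    # Shape: (num_rows - window_size + 1, num_columns, window_size)
    windows = sliding_window_view(features, window_size, axis=0)
    return np.hstack([
        windows.mean(axis=-1),
        windows.max(axis=-1),
        windows.min(axis=-1),
        windows.std(axis=-1),
    ])


# Synthetic stand-in for the normalised FT matrix (100 days, 6 columns)
rng = np.random.default_rng(0)
demo_ft = rng.random((100, 6))
demo_result = rolling_stats(demo_ft, window_size=30)
print(demo_result.shape)  # (71, 24)
```

With the real data, `rolling_stats(FT, window_size)` should give the same matrix as `result`, since the column order (mean, max, min, std) matches `result_list`.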

4. Prediction model (XGBoost)


y1 = df['open'][29:]
y2 = df['close'][29:]
y3 = df['high'][29:]
y4 = df['low'][29:]
models1 = xgb.XGBRegressor()
models2 = xgb.XGBRegressor()
models3 = xgb.XGBRegressor()
models4 = xgb.XGBRegressor()
models1.fit(XX, y1)
models2.fit(XX, y2)
models3.fit(XX, y3)
models4.fit(XX, y4)
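The four models above are fitted on the entire history, so nothing here measures how they do on days they have not seen. One possible check, sketched below under stated assumptions, is to hold out the most recent days in time order and compare the error against a naive "same as yesterday" forecast. Everything in the sketch is illustrative: `chronological_split`, `persistence_forecast` and `MeanRegressor` are hypothetical names, the price series is synthetic, and `MeanRegressor` is only a stand-in with the same `fit`/`predict` interface so that an `xgb.XGBRegressor()` could be dropped in its place.

```python
import numpy as np


def chronological_split(X, y, test_size=30):
    """Hold out the most recent `test_size` rows; no shuffling."""
    return X[:-test_size], X[-test_size:], y[:-test_size], y[-test_size:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def persistence_forecast(y, test_size=30):
    """Naive baseline: predict each held-out day with the previous day's value."""
    y = np.asarray(y)
    return y[-test_size - 1:-1]


class MeanRegressor:
    """Hypothetical stand-in exposing the same fit/predict interface as XGBRegressor."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


# Synthetic random-walk "close" series, for illustration only
rng = np.random.default_rng(42)
demo_y = 3000 + np.cumsum(rng.normal(0, 20, size=500))
demo_X = np.arange(500, dtype=float).reshape(-1, 1)

X_train, X_test, y_train, y_test = chronological_split(demo_X, demo_y, test_size=30)
model = MeanRegressor().fit(X_train, y_train)
model_mae = mean_absolute_error(y_test, model.predict(X_test))
baseline_mae = mean_absolute_error(y_test, persistence_forecast(demo_y, test_size=30))
print(f"model MAE: {model_mae:.2f}, persistence MAE: {baseline_mae:.2f}")
```

If a model cannot beat the persistence baseline on the held-out days, its forecasts for the following week deserve extra caution.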

5. Application and plotting


# Build the next five weekdays after today
start_date = pd.to_datetime(today)
bday_cn = CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri')
future_dates = pd.date_range(start=start_date, periods=6, freq=bday_cn)
future_dates_str = [date.strftime('%Y-%m-%d') for date in future_dates][1:]
future_dates_str = pd.Series(future_dates_str).str.replace('-', '')
X_x = future_dates_str.apply(lambda x: change1(x)).values.reshape(-1, 1)
X_month_day_x = future_dates_str.apply(lambda x: change2(x)).values.reshape(-1, 1)
X_week_day_x = future_dates_str.apply(lambda x: change3(x)).values.reshape(-1, 1)
XXX = np.concatenate((X_x, X_week_day_x, X_month_day_x), axis=1)

# Reuse the most recent window statistics for all five future days
last_column = result[-1:, ]
repeated_last_column = np.tile(last_column, (5, 1))
result = repeated_last_column
XXX = np.concatenate((XXX, result), axis=1)
pred1 = models1.predict(XXX)
pred2 = models2.predict(XXX)
pred3 = models3.predict(XXX)
pred4 = models4.predict(XXX)

# Append the five predictions to the last 30 days of real data
y1 = np.array(df['open'][-30:])
y2 = np.array(df['close'][-30:])
y3 = np.array(df['high'][-30:])
y4 = np.array(df['low'][-30:])
YD = np.array(df['date'][-30:])
data = {
    'open': np.concatenate([y1, pred1]),
    'close': np.concatenate([y2, pred2]),
    'high': np.concatenate([y3, pred3]),
    'low': np.concatenate([y4, pred4]),
    'date': np.concatenate([YD, np.array(future_dates_str)])
}
df = pd.DataFrame(data)

import mplfinance as mpf

# df['date'] = pd.date_range(start=RQ, periods=len(df))
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)
# mpf.plot(df, type='candle', title='Stock K-Line')
my_color = mpf.make_marketcolors(
    up='red',  # red for rising candles
    down='green',  # green for falling candles
    # edge='i',  # hide candle edges
    # volume='in',  # use the same colours for volume
    inherit=True)
my_style = mpf.make_mpf_style(
    # gridaxis='both',  # grid settings
    # gridstyle='-.',
    # y_on_right=True,
    marketcolors=my_color)
mpf.plot(df, type='candle',
         style=my_style,
         # datetime_format='%Y年%m月%d日',
         title='Stock K-Line')
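The `bday_cn` calendar above only skips weekends, so a public holiday that falls on a weekday would still be treated as a trading day. `CustomBusinessDay` also accepts a `holidays` argument, which is one way to handle that. The sketch below shows the idea; the dates in `example_holidays` are placeholders for illustration and not an official exchange calendar, and `next_trading_days` is a hypothetical helper name.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Placeholder closure dates for illustration only; not an official calendar
example_holidays = ["2023-09-29", "2023-10-02", "2023-10-03"]

trading_day = CustomBusinessDay(weekmask="Mon Tue Wed Thu Fri",
                                holidays=example_holidays)


def next_trading_days(start, periods=5):
    """Return the `periods` trading days strictly after `start`."""
    start = pd.Timestamp(start).normalize()
    # Adding one offset moves to the first trading day after `start`
    first = start + trading_day
    return pd.date_range(start=first, periods=periods, freq=trading_day)


demo_dates = next_trading_days("2023-09-27")
print([d.strftime("%Y%m%d") for d in demo_dates])
```

Stepping forward by one offset, rather than generating six dates and dropping the first, also keeps the result anchored to the days strictly after the start date when the script is run on a weekend.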

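6. Results (forecast for next week's SSE Composite Index: the last five days in the chart are the predictions)

What the chart shows, in summary:

1. Monday through Wednesday: a very slight rise.

2. Next Thursday and Friday: the index opens high and keeps climbing (surprisingly).

If you wanted to position yourself in advance, the move would be to buy at the lowest point on Thursday.

Full code, runnable in one click: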

import akshare as ak
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
import xgboost as xgb

df = ak.stock_zh_index_daily_em(symbol='sh000001')

today = datetime.today()
date_str = today.strftime("%Y%m%d")
base = int(datetime.strptime(date_str, "%Y%m%d").timestamp())
change1 = lambda x: (int(datetime.strptime(x, "%Y%m%d").timestamp()) - base) / 86400
change2 = lambda x: (datetime.strptime(str(x), "%Y%m%d")).day
change3 = lambda x: datetime.strptime(str(x), "%Y%m%d").weekday()

df['date'] = df['date'].str.replace('-', '')
X = df['date'].apply(lambda x: change1(x)).values.reshape(-1, 1)
X_month_day = df['date'].apply(lambda x: change2(x)).values.reshape(-1, 1)
X_week_day = df['date'].apply(lambda x: change3(x)).values.reshape(-1, 1)
XX = np.concatenate((X, X_week_day, X_month_day), axis=1)[29:]
FT = np.array(df.drop(columns=['date']))
min_vals = np.min(FT, axis=0)
max_vals = np.max(FT, axis=0)
FT = (FT - min_vals) / (max_vals - min_vals)

window_size = 30
num_rows, num_columns = FT.shape
new_num_rows = num_rows - window_size + 1
result1 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result1[i] = np.mean(window, axis=0)
result2 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result2[i] = np.max(window, axis=0)
result3 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result3[i] = np.min(window, axis=0)
result4 = np.empty((new_num_rows, num_columns))
for i in range(new_num_rows):
    window = FT[i: i + window_size]
    result4[i] = np.std(window, axis=0)
result_list = [result1, result2, result3, result4]
result = np.hstack(result_list)

XX = np.concatenate((XX, result), axis=1)

y1 = df['open'][29:]
y2 = df['close'][29:]
y3 = df['high'][29:]
y4 = df['low'][29:]
models1 = xgb.XGBRegressor()
models2 = xgb.XGBRegressor()
models3 = xgb.XGBRegressor()
models4 = xgb.XGBRegressor()
models1.fit(XX, y1)
models2.fit(XX, y2)
models3.fit(XX, y3)
models4.fit(XX, y4)

start_date = pd.to_datetime(today)
bday_cn = CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri')
future_dates = pd.date_range(start=start_date, periods=6, freq=bday_cn)
future_dates_str = [date.strftime('%Y-%m-%d') for date in future_dates][1:]
future_dates_str = pd.Series(future_dates_str).str.replace('-', '')
X_x = future_dates_str.apply(lambda x: change1(x)).values.reshape(-1, 1)
X_month_day_x = future_dates_str.apply(lambda x: change2(x)).values.reshape(-1, 1)
X_week_day_x = future_dates_str.apply(lambda x: change3(x)).values.reshape(-1, 1)
XXX = np.concatenate((X_x, X_week_day_x, X_month_day_x), axis=1)
last_column = result[-1:, ]
repeated_last_column = np.tile(last_column, (5, 1))
result = repeated_last_column
XXX = np.concatenate((XXX, result), axis=1)
pred1 = models1.predict(XXX)
pred2 = models2.predict(XXX)
pred3 = models3.predict(XXX)
pred4 = models4.predict(XXX)

y1 = np.array(df['open'][-30:])
y2 = np.array(df['close'][-30:])
y3 = np.array(df['high'][-30:])
y4 = np.array(df['low'][-30:])
YD = np.array(df['date'][-30:])
data = {
    'open': np.concatenate([y1, pred1]),
    'close': np.concatenate([y2, pred2]),
    'high': np.concatenate([y3, pred3]),
    'low': np.concatenate([y4, pred4]),
    'date': np.concatenate([YD, np.array(future_dates_str)])
}
df = pd.DataFrame(data)

import mplfinance as mpf

# df['date'] = pd.date_range(start=RQ, periods=len(df))
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)
# mpf.plot(df, type='candle', title='Stock K-Line')
my_color = mpf.make_marketcolors(
    up='red',  # red for rising candles
    down='green',  # green for falling candles
    # edge='i',  # hide candle edges
    # volume='in',  # use the same colours for volume
    inherit=True)
my_style = mpf.make_mpf_style(
    # gridaxis='both',  # grid settings
    # gridstyle='-.',
    # y_on_right=True,
    marketcolors=my_color)
mpf.plot(df, type='candle',
         style=my_style,
         # datetime_format='%Y年%m月%d日',
         title='Stock K-Line')
