Milvus向量库安装部署

GitHub - milvus-io/milvus-sdk-java: Java SDK for Milvus.

1、安装Standstone 版本

参考:Linux之milvus向量数据库安装_milvus安装-CSDN博客

参考:Install Milvus Standalone with Docker Milvus documentation

一、安装步骤

1、安装docker

  docker的安装见博文Linux之docker安装,这里不再赘述。

2、安装fio命令

   yum install -y fio

3、磁盘性能测试

fio --rw=write --ioengine=sync --fdatasync=1 --directory=test-data --size=2200m --bs=2300 --name=mytest

4、检查CPU支持的指令集

我们使用lscpu命令可以查看CPU支持的指令集,Flags的参数值就是该服务器支持的CPU指令集

lscpu

5、检查docker版本

  根据milvus安装要求,docker版本要求是19.03以上版本,我们这里安装的docker版本为23.0.1,满足要求。

6、安装docker compose组件

  根据milvus安装要求,docker compose版本要求是1.25.1以上,我们这里安装的版本是1.29.2,满足要求。

yum -y install python3-pip

pip3 install --upgrade pip

 pip install docker-compose

下载

wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh

启动 Start Milvus

bash standalone_embed.sh start

  • Stop Milvus

bash standalone_embed.sh stop

  • Connect to Milvus

To delete data after stopping Milvus, run:

bash standalone_embed.sh delete

运行

Run Milvus using Python Milvus documentation

安装 PyMilvus 

python3 -m pip install pymilvus==2.3.6

python3 -m pip install pymilvus  最新版本是2.2.4

python3 -c "from pymilvus import Collection"

wget https://raw.githubusercontent.com/milvus-io/pymilvus/master/examples/hello_milvus.py

Run the example code

# hello_milvus.py demonstrates the basic operations of PyMilvus, a Python SDK of Milvus.
# 1. connect to Milvus
# 2. create collection
# 3. insert data
# 4. create index
# 5. search, query, and hybrid search on entities
# 6. delete entities by PK
# 7. drop collection
import timeimport numpy as np
from pymilvus import (connections,utility,FieldSchema, CollectionSchema, DataType,Collection,
)fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 3000, 8#################################################################################
# 1. connect to Milvus
# Add a new connection alias `default` for Milvus server in `localhost:19530`
# Actually the "default" alias is a buildin in PyMilvus.
# If the address of Milvus is the same as `localhost:19530`, you can omit all
# parameters and call the method as: `connections.connect()`.
#
# Note: the `using` parameter of the following methods is default to "default".
print(fmt.format("start connecting to Milvus"))
# connects to a server
connections.connect("default", host="localhost", port="19530")has = utility.has_collection("hello_milvus")
print(f"Does collection hello_milvus exist in Milvus: {has}")#################################################################################
# 2. create collection
# We're going to create a collection with 3 fields.
# +-+------------+------------+------------------+------------------------------+
# | | field name | field type | other attributes |       field description      |
# +-+------------+------------+------------------+------------------------------+
# |1|    "pk"    |   VarChar  |  is_primary=True |      "primary field"         |
# | |            |            |   auto_id=False  |                              |
# +-+------------+------------+------------------+------------------------------+
# |2|  "random"  |    Double  |                  |      "a double field"        |
# +-+------------+------------+------------------+------------------------------+
# |3|"embeddings"| FloatVector|     dim=8        |  "float vector with dim 8"   |
# +-+------------+------------+------------------+------------------------------+
fields = [FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),FieldSchema(name="random", dtype=DataType.DOUBLE),FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim)
]schema = CollectionSchema(fields, "hello_milvus is the simplest demo to introduce the APIs")print(fmt.format("Create collection `hello_milvus`"))
hello_milvus = Collection("hello_milvus", schema, consistency_level="Strong")################################################################################
# 3. insert data
# We are going to insert 3000 rows of data into `hello_milvus`
# Data to be inserted must be organized in fields.
#
# The insert() method returns:
# - either automatically generated primary keys by Milvus if auto_id=True in the schema;
# - or the existing primary key field from the entities if auto_id=False in the schema.
# inserts vectors in the collection
print(fmt.format("Start inserting entities"))
rng = np.random.default_rng(seed=19530)
entities = [# provide the pk field because `auto_id` is set to False[str(i) for i in range(num_entities)],rng.random(num_entities).tolist(),  # field random, only supports listrng.random((num_entities, dim)),    # field embeddings, supports numpy.ndarray and list
]insert_result = hello_milvus.insert(entities)hello_milvus.flush()
print(f"Number of entities in Milvus: {hello_milvus.num_entities}")  # check the num_entities################################################################################
# 4. create index
# We are going to create an IVF_FLAT index for hello_milvus collection.
# create_index() can only be applied to `FloatVector` and `BinaryVector` fields.
# builds indexes on the entities:
print(fmt.format("Start Creating index IVF_FLAT"))
index = {"index_type": "IVF_FLAT","metric_type": "L2","params": {"nlist": 128},
}hello_milvus.create_index("embeddings", index)################################################################################
# 5. search, query, and hybrid search
# After data were inserted into Milvus and indexed, you can perform:
# - search based on vector similarity
# - query based on scalar filtering(boolean, int, etc.)
# - hybrid search based on vector similarity and scalar filtering.
## Before conducting a search or a query, you need to load the data in `hello_milvus` into memory.
# Loads the collection to memory and performs a vector similarity search:
print(fmt.format("Start loading"))
hello_milvus.load()# -----------------------------------------------------------------------------
# search based on vector similarity
print(fmt.format("Start searching based on vector similarity"))
vectors_to_search = entities[-1][-2:]
search_params = {"metric_type": "L2","params": {"nprobe": 10},
}start_time = time.time()
result = hello_milvus.search(vectors_to_search, "embeddings", search_params, limit=3, output_fields=["random"])
end_time = time.time()for hits in result:for hit in hits:print(f"hit: {hit}, random field: {hit.entity.get('random')}")
print(search_latency_fmt.format(end_time - start_time))# -----------------------------------------------------------------------------
# query based on scalar filtering(boolean, int, etc.)
print(fmt.format("Start querying with `random > 0.5`"))start_time = time.time()
result = hello_milvus.query(expr="random > 0.5", output_fields=["random", "embeddings"])
end_time = time.time()print(f"query result:\n-{result[0]}")
print(search_latency_fmt.format(end_time - start_time))# -----------------------------------------------------------------------------
# pagination
r1 = hello_milvus.query(expr="random > 0.5", limit=4, output_fields=["random"])
r2 = hello_milvus.query(expr="random > 0.5", offset=1, limit=3, output_fields=["random"])
print(f"query pagination(limit=4):\n\t{r1}")
print(f"query pagination(offset=1, limit=3):\n\t{r2}")# -----------------------------------------------------------------------------
# hybrid search
print(fmt.format("Start hybrid searching with `random > 0.5`"))start_time = time.time()
result = hello_milvus.search(vectors_to_search, "embeddings", search_params, limit=3, expr="random > 0.5", output_fields=["random"])
end_time = time.time()for hits in result:for hit in hits:print(f"hit: {hit}, random field: {hit.entity.get('random')}")
print(search_latency_fmt.format(end_time - start_time))###############################################################################
# 6. delete entities by PK
# You can delete entities by their PK values using boolean expressions.
ids = insert_result.primary_keysexpr = f'pk in ["{ids[0]}" , "{ids[1]}"]'
print(fmt.format(f"Start deleting with expr `{expr}`"))result = hello_milvus.query(expr=expr, output_fields=["random", "embeddings"])
print(f"query before delete by expr=`{expr}` -> result: \n-{result[0]}\n-{result[1]}\n")hello_milvus.delete(expr)result = hello_milvus.query(expr=expr, output_fields=["random", "embeddings"])
print(f"query after delete by expr=`{expr}` -> result: {result}\n")###############################################################################
# 7. drop collection
# Finally, drop the hello_milvus collection
print(fmt.format("Drop collection `hello_milvus`"))
utility.drop_collection("hello_milvus")

python3 hello_milvus.py

docker ps

192.168.1.242:9091/api/v1/health

使用浏览器访问连接地址http://ip:9091/api/v1/health,返回{“status”:“ok”}说明milvus数据库服务器运行正常。

docker port milvus-standalone

安装Attu

参考:https://github.com/zilliztech/attu/blob/main/doc/zh-CN/attu_install-docker.md

执行:

docker run -p 8000:3000 -e MILVUS_URL=192.168.1.242:19530 zilliz/attu:latest

待参考:kubernetes部署milvus_milvus集群版-CSDN博客

具体使用:

参考:Milvus技术探究 - 知乎 (zhihu.com)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.hqwc.cn/news/479163.html

如若内容造成侵权/违法违规/事实不符,请联系编程知识网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

机器学习基础(二)监督与非监督学习

导语:更深入地探讨监督学习和非监督学习的知识,重点关注它们的理论基础、常用算法及实际应用场景。 上一节我们深入探索机器学习的根本原理,包括基本概念、分类及如何通过构建预测模型来应用这些理论,详情可见: 机器学…

【力扣 - 对称二叉树】

题目描述 给你一个二叉树的根节点 root &#xff0c; 检查它是否轴对称。 提示&#xff1a; 树中节点数目在范围 [1, 1000] 内 -100 < Node.val < 100题解1 /*** Definition for a binary tree node.* struct TreeNode {* int val;* struct TreeNode *lef…

Kernelized Correlation Filters KCF算法原理详解(阅读笔记)(待补充)

KCF 目录 KCF预备知识1. 岭回归2. 循环移位和循环矩阵3. 傅里叶对角化4. 方向梯度直方图&#xff08;HOG&#xff09; 正文1. 线性回归1.1. 岭回归1.2. 基于循环矩阵获取正负样本1.3. 基于傅里叶对角化的求解 2. 使用非线性回归对模型进行训练2.1. 应用kernel-trick的非线性模型…

GZ036 区块链技术应用赛项赛题第8套

2023年全国职业院校技能大赛 高职组 “区块链技术应用” 赛项赛卷&#xff08;8卷&#xff09; 任 务 书 参赛队编号&#xff1a; 背景描述 现实中患者私密信息泄露情况时有发生&#xff0c;医疗部门的柜式存储和纸质记录已不再是最优选择。在2015-2016年间&…

汽车控制器软件正向开发

需求常见问题: 1.系统需求没有分层,没有结构化,依赖关系不明确 2.需求中没有验证准则 3.对客户需求的追溯缺失,不完整,颗粒度不够 4.系统需求没有相应的系统架构,需求没有分解到硬件和软件 5.需求变更管控不严格,变更频繁,变更纪录描述不准确,有遗漏,客户需求多…

AI新工具(20240219) Ollama Windows预览版;谷歌开源的人工智能文件类型识别系统; PopAi是您的个人人工智能工作空间

Ollama Windows preview - Ollama Windows预览版用户可以在本地创建和运行大语言模型&#xff0c;并且支持NVIDIA GPU和现代CPU指令集的硬件加速 Ollama发布了Windows预览版&#xff0c;使用户能够在原生的Windows环境中拉取、运行和创建大语言模型。该版本支持英伟达的GPU&am…

【4.1计算机网络】TCP-IP协议簇

目录 1.OSI七层模型2.常见协议及默认端口3.TCP与UDP的区别 1.OSI七层模型 osi七层模型&#xff1a; 1.应用层 2.表示层 3.会话层 4.传输层&#xff1a;TCP为可靠的传输层协议。 5.网络层 6.数据链路层 7.物理层 2.常见协议及默认端口 3.TCP与UDP的区别 例题1. 解析&#xff1…

VTK通过线段裁剪

线段拆分网格 void retrustMesh(vtkSmartPointer<vtkPolyData> polydata, vtkSmartPointer<vtkPoints> intermediatePoint) {vtkSmartPointer<vtkPoints> srcPoints polydata->GetPoints();int pointSize intermediatePoint->GetNumberOfPoints();/…

小迪安全2023最新版笔记集合--续更

小迪安全2023最新版笔记集合–续更 小迪安全2023最新笔记集合 章节一 ---- 基础入门&#xff1a; 知识点集合&#xff1a; 应用架构&#xff1a;Web/APP/云应用/三方服务/负载均衡等 安全产品&#xff1a;CDN/WAF/IDS/IPS/蜜罐/防火墙/杀毒等 渗透命令&#xff1a;文件上传下…

欠定方程组及其求解

欠定方程组是指方程的数量少于未知数的数量的方程组。在这种情况下&#xff0c;通常有无限多个解&#xff0c;因为给定的方程不足以唯一确定所有未知数的值。在某些情况下&#xff0c;我们可以利用额外的信息或假设&#xff0c;如稀疏性或其他约束&#xff0c;来找到一个合理的…

如何保护文件夹数据?怎么加密文件夹数据?

文件夹是电脑储存、管理数据的重要工具&#xff0c;为了避免文件夹数据泄露&#xff0c;我们需要加密保护文件夹。那么&#xff0c;怎么加密文件夹呢&#xff1f;下面我们就来了解一下。 如何避免文件夹数据泄露&#xff1f; 保护文件夹数据安全的方法有很多&#xff0c;例如…

169基于matlab的小波神经网络预测

基于matlab的小波神经网络预测&#xff0c;通过权值参数更新得到误差较小模型&#xff0c;进行多输出单输出预测。输出预测可视化结果。程序已调通&#xff0c;可直接运行。 169matlab小波神经网络预测 多输入单输出 (xiaohongshu.com)