ElasticSearch入门教程--集群搭建和版本比较

文章目录

  • 一、ElasticSearch 集群
  • 二、Elasticsearch的核心概念
    • 2.1、分片(Shards)
    • 2.2、副本(Replicas)
    • 2.3、路由计算
    • 2.4、倒排索引
  • 三、Kibana简介
  • 四、Spring Data ElasticSearch

一、ElasticSearch 集群

Elasticsearch 集群有一个唯一的名字,默认就是”elasticsearch”。,一个节点只能通过指定某个集群的名字,来加入这个集群。
在这里插入图片描述

集群搭建如下:

  1. 复制ES的安装目录三份:esnode-1,esnode-2,esnode-3,分别编辑config/elasticsearch.yml 配置文件

node-1:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-elasticsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
node.master: true
node.data: true 
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: localhost
#
# Set a custom port for HTTP:
#
http.port: 1001
#tcp 监听端口
transport.tcp.port: 9301#discovery.seed_hosts: ["localhost:9201", "localhost:9301", "localhost:9401"]
#discovery.zen.fd.ping_timeout: 1m
#discovery.zen.fd.ping_retries: 5
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["localhost:9301", "localhost:9401"]
#discovery.zen.fd.ping_timeout: 1m
#discovery.zen.fd.ping_retries: 5
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true#跨域配置http.cors.enabled: true
http.cors.allow-origin: "*"

node-2:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-elasticsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-2
node.master: true
node.data: true
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: localhost
#
# Set a custom port for HTTP:
#
http.port: 1002
transport.tcp.port: 9302#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#discovery.seed_hosts: ["localhost:9301"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#跨域配置
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"

node-3:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-elasticsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-3
node.master: true
node.data: true
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: localhost
#
# Set a custom port for HTTP:
#
http.port: 1003
transport.tcp.port: 9303
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["localhost:9302", "localhost:9301"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#跨域配置
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"

2、分别启动,浏览器访问http://localhost:1001/_cluster/healthhttp://localhost:1002/_cluster/healthhttp://localhost:1003/_cluster/health,集群启动成功。

在这里插入图片描述
搭建成功后,可以将请求发送到集群中的任何节点 ,包括主节点。 每个节点都知道文档所处的位置,并且能够将我们的请求直接转发到存储我们所需文档的节点。 无论我们将请求发送到哪个节点,它都能负责从各个包含我们所需文档的节点收集回数据,并将最终结果返回給客户端。 Elasticsearch 对这一切的管理都是透明的。

二、Elasticsearch的核心概念

2.1、分片(Shards)

一个索引可以存储超出单个节点硬件限制的大量数据。比如,一个具有 10 亿文档数据的索引占据 1TB 的磁盘空间,而任一节点都可能没有这样大的磁盘空间。或者单个节点处理搜索请求,响应太慢。为了解决这个问题,Elasticsearch 提供了将索引划分成多份的能力,每一份就称之为分片。当你创建一个索引的时候,你可以指定你想要的分片的数量。每个分片本身也是一个功能完善并且独立的“索引”,这个“索引”可以被放置到集群中的任何节点上。

分片很重要,主要有两方面的原因:
1)允许水平分割 / 扩展内容容量。
2)允许在分片之上进行分布式的、并行的操作,进而提高性能/吞吐量。

当 Elasticsearch 在索引中搜索的时候, 他发送查询到每一个属于索引的分片(Lucene 索引),然后合并每个分片的结果到一个全局的结果集。

2.2、副本(Replicas)

Elasticsearch 允许你创建分片的一份或多份拷贝,这些拷贝叫做复制分片(副本)。

复制分片之所以重要,有两个主要原因:

提供了高可用性。复制分片从不与原分片置于同一节点上是非常重要的。

扩展搜索量/吞吐量,因为搜索可以在所有的副本上并行运行。当拥有越多的副本分片时,也将拥有越高的吞吐量。

每个索引可以被分成多个分片。一经复制,每个索引就有了主分片(作为复制源的原来的分片)和复制分片(主分片的拷贝)之别。分片和复制的数量可以在索引创建的时候指定。在索引创建之后,你可以在任何时候动态地改变复制的数量,但是不能改变分片的数量

默认情况下,Elasticsearch 中的每个索引被分片 1 个主分片和 1 个复制,根据索引需要确定分片个数。

2.3、路由计算

Elasticsearch 如何知道一个文档应该存放到哪个分片中呢?当我们创建文档时,它如何决定这个文档应当被存储在分片1 还是分片 2 中呢?

这个过程是根据下面这个公式决定的:

shard = hash(routing) % number of primary_shards

routing 是一个可变值,默认是文档的 _id ,也可以设置成一个自定义的值。 routing 通过hash 函数生成一个数字,然后这个数字再除以 number_of_primary_shards (主分片的数量)后得到的余数 就是文档所在分片的位置。

2.4、倒排索引

Elasticsearch 使用一种倒排索引的结构,适用于快速的全文搜索。

我们经常了解的是正向索引(forward index),所谓正向索引,就是搜索引擎会将待搜索的文件都对应一个文件 ID,搜索时将这个ID 和搜索关键字进行对应,形成 K-V 对,然后对关键字进行统计计数,常见Mysql中的表结构。

倒排索引,搜索引擎会将正向索引重新构建为倒排索引,即把文件ID对应到关键词的映射转换为关键词到文件ID的映射,每个关键词都对应着一系列的文件,这些文件中都出现这个关键词。

三、Kibana简介

Kibana 是一个免费且开放的用户界面,能够让你对 Elasticsearch 数据进行可视化,并让你在 Elastic Stack 中进行导航。你可以进行各种操作,从跟踪查询负载,到理解请求如何流经你的整个应用,都能轻松完成。

下载地址:点击下载windows版本

修改 config/kibana.yml 文件

server.port: 5601
server.host: "127.0.0.1"
elasticsearch.hosts: ["http://127.0.0.1:9200"]
kibana.index: ".kibana"
i18n.locale: "zh-CN"

运行行 bin/kibana.bat ,浏览器访问 http://localhost:5601

在这里插入图片描述

四、Spring Data ElasticSearch

Spring Data Elasticsearch 基于 spring data API 简化 Elasticsearch 操作,将原始操作Elasticsearch 的客户端 API 进行封装 。

Spring Data 为 Elasticsearch 项目提供集成搜索引擎。

1

官方网站: 点击进入

集成步骤:
1、新建Maven项目。POM文件添加依赖:

 <dependencies><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-data-elasticsearch</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-web</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-actuator</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-devtools</artifactId><scope>runtime</scope><optional>true</optional></dependency><dependency><groupId>org.projectlombok</groupId><artifactId>lombok</artifactId><optional>true</optional></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-test</artifactId></dependency><dependency><groupId>org.springframework</groupId><artifactId>spring-test</artifactId><version>5.2.13.RELEASE</version></dependency><dependency><groupId>junit</groupId><artifactId>junit</artifactId></dependency></dependencies>

2、application.yml 配置:

# es 服务地址
elasticsearch:host: localhostport: 9200

3、Entity类:

package com.swc.es.entity;import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;import java.util.Date;@Data
@AllArgsConstructor
@NoArgsConstructor
@ToString
@Document(indexName = "user", shards = 3, replicas = 1)
public class User {@Idprivate Long id;private String name;@Field(index = false)private String sex;private int age;private String address;private Date birth;}

4、增加配置类,读取配置信息

ElasticsearchRestTemplate 是基于 RestHighLevelClient 客户端的。只要自定义配置类,继承AbstractElasticsearchConfiguration,并实现 elasticsearchClient()抽象方法,创建 RestHighLevelClient 对
象。

public class ElasticSearchConfig extends AbstractElasticsearchConfiguration {@Value("elasticsearch.host")private String host ;@Value("elasticsearch.port")private Integer port ;@Overridepublic RestHighLevelClient elasticsearchClient() {RestClientBuilder builder = RestClient.builder(new HttpHost(host, port));RestHighLevelClient restHighLevelClient = newRestHighLevelClient(builder);return restHighLevelClient;}
}

5、DAO 数据访问对象

@Repository
public interface UserDao extends ElasticsearchRepository<User, Long> {
}

6、测试:

索引测试:

@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataESTest {//注入 ElasticsearchRestTemplate@Autowiredprivate ElasticsearchRestTemplate elasticsearchRestTemplate;@Autowiredprivate UserDao userDao;@Testpublic void createIndex(){//System.out.println("创建索引,系统初始化会自动创建索引");}@Testpublic void deleteIndex(){//创建索引,系统初始化会自动创建索引boolean flg = elasticsearchRestTemplate.deleteIndex(User.class);System.out.println("删除索引 = " + flg);}}

文档查询测试:

package com.swc.es.controller;import com.swc.es.dao.UserDao;
import com.swc.es.entity.User;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.test.context.junit4.SpringRunner;import java.util.ArrayList;
import java.util.Date;
import java.util.List;@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataESUserDaoTest {@Autowiredprivate UserDao userDao;/*** 新增*/@Testpublic void save(){User user = new User();user.setId(1L);user.setName("张三");user.setAge(25);user.setBirth(new Date());user.setAddress("北京市丰台区");userDao.save(user);}//修改@Testpublic void update(){User user = new User();user.setId(5L);user.setName("张三大哥");user.setSex("男");user.setAge(25);user.setBirth(new Date());user.setAddress("北京市丰台区");userDao.save(user);}//根据 id 查询@Testpublic void findById(){User user = userDao.findById(5L).get();System.out.println(user);}//查询所有@Testpublic void findAll(){Iterable<User> Users = userDao.findAll();for (User user : Users) {System.out.println(user);}}//删除@Testpublic void delete(){User user = new User();user.setId(1L);userDao.delete(user);}//批量新增@Testpublic void saveAll(){List<User> userList = new ArrayList<>();for (int i = 2; i < 10; i++) {User user = new User();user.setId(Long.valueOf(i));user.setName("张三大哥");user.setAge(25+i);user.setBirth(new Date());user.setAddress("北京市丰台区"+i+"号院");userList.add(user);}userDao.saveAll(userList);}//分页查询@Testpublic void findByPageable(){//设置排序(排序方式,正序还是倒序,排序的 id)Sort sort = Sort.by(Sort.Direction.DESC,"id");int currentPage=0;//当前页,第一页从 0 开始,1 表示第二页int pageSize = 5;//每页显示多少条//设置查询分页PageRequest pageRequest = PageRequest.of(currentPage, pageSize,sort);//分页查询Page<User> UserPage = userDao.findAll(pageRequest);for (User user : UserPage.getContent()) {System.out.println(user);}}/*** term 查询* search(termQueryBuilder) 调用搜索方法,参数查询构建器对象*/@Testpublic void termQuery(){TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "张三");Iterable<User> users = userDao.search(termQueryBuilder);for (User user : users) {System.out.println(user);}}/*** term 查询加分页*/@Testpublic void termQueryByPage(){int currentPage= 0 ;int pageSize = 3;//设置查询分页PageRequest pageRequest = PageRequest.of(currentPage, pageSize);TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "张三");Iterable<User> users = userDao.search(termQueryBuilder,pageRequest);for (User user : users) {System.out.println(user);}}
}

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.hqwc.cn/news/22739.html

如若内容造成侵权/违法违规/事实不符,请联系编程知识网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

CAD2021安装教程适合新手小白【附安装包和手册】

提示&#xff1a;文章写完后&#xff0c;目录可以自动生成&#xff0c;如何生成可参考右边的帮助文档 文章目录 前言一、下载文件二、使用步骤1.安装软件前&#xff0c;断开电脑网络&#xff08;拔掉网线、关闭WIFI&#xff09;2、鼠标右击【AutoCAD2021(64bit)】压缩包选择【解…

无线电音频-BPA600蓝牙协议分析仪名词解析

1 介绍 2 Baseband基带分析 (1)Delta 是什么含义? "Delta" 有多个含义,取决于上下文。以下是常见的几种含义: 希腊字母:Delta&#x

【云原生】 一文了解Docker到底是什么?

目录 1.docker是什么&#xff1f; 2.为什么需要docker&#xff1f; 3.docker特点 4.docker架构 5.云计算中的服务包括三个层面 6.传统虚拟化架构 7.容器架构 8.docker系统架构 Docker 守护进程 Docker 客户端 Docker 仓库 Docker 对象 Images&#xff08;镜像&#xff09; Cont…

【网络安全】渗透测试工具——Burp Suite

渗透测试工具Burp Suite主要功能详解 前言一、 Proxy模块1.1 界面布局1.1.1 菜单栏&#xff08;1&#xff09; 菜单栏 Burp&#xff08;2&#xff09; 菜单栏 project&#xff08;3&#xff09; 菜单栏 Intruder&#xff08;4&#xff09; 菜单栏 Repeater&#xff08;5&#x…

【PHP面试题44】PHP5的版本和PHP7之间有哪些区别

文章目录 一、前言二、底层调整2.1性能提升2.2 新的引擎2.3 数据类型改进2.4 错误处理改进2.5 语言特性增加 三、应用层差异3.1 兼容性3.2 类和方法改进3.3 错误处理机制3.4 性能优化3.5 新的扩展支持 四、一些语法糖示例4.1 标量类型声明示例4.2 新增了Spaceship操作符&#x…

【Spring】使用注解读取和存储Bean对象

哈喽&#xff0c;哈喽&#xff0c;大家好~ 我是你们的老朋友&#xff1a;保护小周ღ 谈起Java 圈子里的框架&#xff0c;最年长最耀眼的莫过于 Spring 框架啦&#xff0c;本期给大家带来的是&#xff1a; 将对象存储到 Spring 中、Bean 对象的命名规则、从Spring 中获取bean …

数据库基本操作--------MySQL 索引

目录 一、MySQL 索引 1&#xff0e;索引的概念 2&#xff0e;索引的作用 3&#xff0e;创建索引的原则依据 4&#xff0e;索引的分类和创建 &#xff08;1&#xff09;普通索引 ●直接创建索引 &#xff08;2&#xff09;唯一索引 &#xff08;3&#xff09;主键索引 ●创…

Java InetAddress类

【InetAddress类】 【相关方法】 【使用方法实例】 【代码结果】

leetcode-541. 反转字符串 II

leetcode-541. 反转字符串 II 文章目录 leetcode-541. 反转字符串 II一.题目描述二.第1次提交(for循环&#xff0c;std::reverse)三.第2次提交四.第3次提交五.第4次提交六.代码随想录解答一七.代码随想录解答二八.代码随想录解答三 一.题目描述 二.第1次提交(for循环&#xff0…

IDE /字符串 /字符编码与文本文件(如cpp源代码文件)

文章目录 概述文本编辑器如何识别文件的编码格式优先推测使用了UTF-8编码&#xff1f;字符编码的BOM字节序标记重分析各文本编辑器下的测试效果Qt Creator的文本编辑器系统记事本VS的文本编辑器Notepad 编译器与代码文件的字符编码ANSI编码其他 概述 前期在整理 《IDE/VS项目属…

云计算的学习(六)

六、云计算的发展趋势 1.云计算相关领域介绍 1.1物联网 物联网来源于互联网&#xff0c;是万物互联的结果&#xff0c;是人和物、物和物之间产生通信和交互。 物联网主要技术&#xff1a; RFID技术&#xff08;射频识别技术&#xff09;传感器技术嵌入式系统技术 1.2大数据…

基于LoRa技术的网络终端无线程序升级系统研究(学习)

摘要 设计了一种基于LoRa技术的STM32F4无线程序升级系统。此系统由PC及相关STM32软件开发环境、LoRa通信模块及控制器和STM32F4终端三部分组成。 本系统采用LoRa技术将程序数据无线发送到终端&#xff0c;终端通过IAP技术实现远程无线程序自动升级。测试结果表明&#xff0c;…