OpenTelemetry: Sampling

Sampling

Learn about sampling and the different sampling options available in OpenTelemetry.

With distributed tracing, you observe requests as they move from one service to another in a distributed system. This is practical for many reasons, such as understanding how your services connect and diagnosing latency issues.

However, if the majority of your requests succeed (HTTP 200 responses) and finish without unacceptable latency or errors, do you really need all that data? You don’t always need a ton of data to find the right insights. You just need the right sampling of data.

The idea behind sampling is to control the spans you send to your observability backend, resulting in lower ingest costs. Organizations differ not only in why they want to sample, but also in what they want to sample. You might want to customize your sampling strategy to:

  • Manage costs: If you have a high volume of telemetry, exporting and storing every span can incur heavy charges from a telemetry backend vendor or cloud provider.
  • Focus on interesting traces: For example, your frontend team may only want to see traces with specific user attributes.
  • Filter out noise: For example, you may want to filter out health checks.

Terminology

It’s important to use consistent terminology when discussing sampling. A trace or span is considered “sampled” or “not sampled”:

  • Sampled: A trace or span is processed and exported. Because it is chosen by the sampler as a representative of the population, it is considered “sampled”.
  • Not sampled: A trace or span is not processed or exported. Because it is not chosen by the sampler, it is considered “not sampled”.

Sometimes, the definitions of these terms get mixed up. You may hear someone say that they are “sampling out data”, or that data which is not processed or exported is “sampled”. Both statements are incorrect.

Head Sampling

Head sampling is a sampling technique used to make a sampling decision as early as possible. A decision to sample or drop a span or trace is not made by inspecting the trace as a whole.

For example, the most common form of head sampling is Consistent Probability Sampling, also referred to as Deterministic Sampling. In this case, a sampling decision is made based on the trace ID and a desired percentage of traces to sample. This ensures that whole traces are sampled, with no missing spans, at a consistent rate, such as 5% of all traces.
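The trace-ID-based decision described above can be sketched as follows. This is a minimal illustration, not the code of any particular SDK: the function name `should_sample` and the choice to compare the low 64 bits of the trace ID against a bound are assumptions made for the example. Because the decision depends only on the trace ID and the ratio, every service that sees the same trace ID reaches the same decision, which is what keeps traces whole.

```python
import random

# Assumption for this sketch: decide on the low 64 bits of a 128-bit trace ID.
TRACE_ID_MASK = (1 << 64) - 1


def should_sample(trace_id: int, ratio: float) -> bool:
    """Return True if this trace ID falls inside the sampled fraction."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    # Map the ratio onto the 64-bit ID space; IDs below the bound are sampled.
    bound = round(ratio * (TRACE_ID_MASK + 1))
    return (trace_id & TRACE_ID_MASK) < bound


# Usage: with uniformly random trace IDs, roughly 5% of traces are kept.
rng = random.Random(42)
trace_ids = [rng.getrandbits(128) for _ in range(100_000)]
kept = sum(should_sample(t, 0.05) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

The sketch assumes trace IDs are uniformly random; if they are not, the kept fraction will drift away from the requested ratio.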

The upsides to head sampling are:

  • Easy to understand
  • Easy to configure
  • Efficient
  • Can be done at any point in the trace collection pipeline

The primary downside to head sampling is that it is not possible to make a sampling decision based on data in the entire trace. This means that head sampling is effective as a blunt instrument, but is wholly insufficient for sampling strategies that must take whole-system information into account. For example, it is not possible to use head sampling to ensure that all traces with an error within them are sampled. For this, you need Tail Sampling.

Tail Sampling

In tail sampling, the decision to sample a trace is made by considering all or most of the spans within the trace. Tail Sampling gives you the option to sample your traces based on specific criteria derived from different parts of a trace, which isn’t an option with Head Sampling.

Some examples of how you can use Tail Sampling include:

  • Always sampling traces that contain an error
  • Sampling traces based on overall latency
  • Sampling traces based on the presence or value of specific attributes on one or more spans in a trace; for example, sampling more traces originating from a newly deployed service
  • Applying different sampling rates to traces based on certain criteria
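The criteria listed above can be combined into a single decision function that runs once all spans of a trace have arrived. The sketch below is illustrative only: the `Span` model, the `tail_decision` name, the thresholds, and the `checkout-v2` service name are hypothetical, and the model flattens the service name into each span's attribute dictionary for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    trace_id: int
    start_ms: float
    end_ms: float
    is_error: bool = False
    attributes: dict = field(default_factory=dict)


def keep_ratio(trace_id: int, ratio: float) -> bool:
    """Deterministically keep roughly `ratio` of traces, keyed on the trace ID."""
    return (trace_id & ((1 << 64) - 1)) < round(ratio * (1 << 64))


def tail_decision(
    spans: list[Span],
    latency_threshold_ms: float = 1000.0,
    boosted_service: str = "checkout-v2",  # hypothetical newly deployed service
    boosted_ratio: float = 0.5,
    baseline_ratio: float = 0.05,
) -> bool:
    """Decide whether to keep a trace after all of its spans have arrived."""
    if not spans:
        return False
    # Policy 1: always keep traces that contain an error.
    if any(s.is_error for s in spans):
        return True
    # Policy 2: keep traces whose end-to-end latency reaches the threshold.
    duration_ms = max(s.end_ms for s in spans) - min(s.start_ms for s in spans)
    if duration_ms >= latency_threshold_ms:
        return True
    # Policy 3: sample traces touching the boosted service at a higher rate.
    trace_id = spans[0].trace_id
    if any(s.attributes.get("service.name") == boosted_service for s in spans):
        return keep_ratio(trace_id, boosted_ratio)
    # Policy 4: everything else gets the baseline rate.
    return keep_ratio(trace_id, baseline_ratio)
```

None of these policies could be evaluated at the head of the trace, because the error flag, the total duration, and the attributes of later spans are not known when the first span starts.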

As you can see, tail sampling allows for a much higher degree of sophistication. For larger systems that must sample telemetry, it is almost always necessary to use Tail Sampling to balance data volume with usefulness of that data.

There are three primary downsides to tail sampling today:

  • Tail sampling can be difficult to implement. Depending on the kind of sampling techniques available to you, it is not always a “set and forget” kind of thing. As your systems change, so too will your sampling strategies. For a large and sophisticated distributed system, rules that implement sampling strategies can also be large and sophisticated.
  • Tail sampling can be difficult to operate. The component(s) that implement tail sampling must be stateful systems that can accept and store a large amount of data. Depending on traffic patterns, this can require dozens or even hundreds of nodes that all utilize resources differently. Furthermore, a tail sampler may need to “fall back” to less computationally intensive sampling techniques if it is unable to keep up with the volume of data it is receiving. Because of these factors, it is critical to monitor tail sampling components to ensure that they have the resources they need to make the correct sampling decisions.
  • Tail samplers often end up in the domain of vendor-specific technology today. If you’re using a paid vendor for observability, the most effective tail sampling options available to you may be limited to what the vendor offers.
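To make the operational cost concrete, here is a minimal sketch of the state a tail sampler has to hold: spans are buffered per trace until a waiting period has passed, and the oldest trace is evicted when the buffer is full. The class name `TraceBuffer` and the parameters `decision_wait_s` and `max_traces` are hypothetical and chosen only for this example; the sketch also assumes that `now` never decreases between calls.

```python
from collections import OrderedDict


class TraceBuffer:
    """Hold spans in memory per trace until a sampling decision can be made."""

    def __init__(self, decision_wait_s: float = 10.0, max_traces: int = 50_000):
        self.decision_wait_s = decision_wait_s
        self.max_traces = max_traces
        # trace_id -> (arrival time of first span, list of spans), oldest first
        self._traces: OrderedDict = OrderedDict()
        self.evicted = 0  # traces dropped early because the buffer was full

    def add(self, trace_id: int, span: dict, now: float) -> None:
        if trace_id not in self._traces:
            if len(self._traces) >= self.max_traces:
                # Out of room: drop the oldest trace without a proper decision.
                self._traces.popitem(last=False)
                self.evicted += 1
            self._traces[trace_id] = (now, [])
        self._traces[trace_id][1].append(span)

    def pop_ready(self, now: float) -> list:
        """Return (trace_id, spans) pairs for traces that have waited long enough."""
        ready = []
        while self._traces:
            trace_id, (first_seen, spans) = next(iter(self._traces.items()))
            if now - first_seen < self.decision_wait_s:
                break  # everything after this entry is newer
            self._traces.popitem(last=False)
            ready.append((trace_id, spans))
        return ready

    def __len__(self) -> int:
        return len(self._traces)
```

Memory use grows with traffic volume multiplied by the waiting period, and the `evicted` counter is the kind of signal worth monitoring: a rising value means decisions are being made without the full trace.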

Finally, for some systems, tail sampling may be used in conjunction with Head Sampling. For example, a set of services that produce an extremely high volume of trace data may first use head sampling to sample only a small percentage of traces, and then later in the telemetry pipeline use tail sampling to make more sophisticated sampling decisions before exporting to a backend. This is often done in the interest of protecting the telemetry pipeline from being overloaded.
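The two-stage arrangement can be sketched as a small pipeline. Everything here is a simplified assumption for illustration: traces are plain dictionaries with hypothetical `trace_id`, `has_error`, and `duration_ms` keys, the head stage keeps a fixed ratio by trace ID, and the tail stage keeps only traces with errors or high latency.

```python
def head_sample(trace_id: int, ratio: float) -> bool:
    """Stage 1: cheap, stateless decision made from the trace ID alone."""
    return (trace_id & ((1 << 64) - 1)) < round(ratio * (1 << 64))


def tail_sample(trace: dict, latency_threshold_ms: float) -> bool:
    """Stage 2: decision made from the completed trace."""
    return trace["has_error"] or trace["duration_ms"] >= latency_threshold_ms


def pipeline(traces, head_ratio=0.10, latency_threshold_ms=1000.0):
    """Return (traces surviving head sampling, traces finally exported)."""
    survivors = [t for t in traces if head_sample(t["trace_id"], head_ratio)]
    exported = [t for t in survivors if tail_sample(t, latency_threshold_ms)]
    return survivors, exported
```

Only the head-sampled survivors ever reach the tail stage, which is what protects the pipeline from overload. The trade-off is that an error trace dropped by the head stage is never seen by the tail sampler.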

Support

Collector

The OpenTelemetry Collector includes the following sampling processors:

  • Probabilistic Sampling Processor
  • Tail Sampling Processor

Language SDKs

For the individual language-specific implementations of the OpenTelemetry API and SDK, you will find support for sampling on the respective documentation pages:

  • Erlang/Elixir
  • Go
  • JavaScript
  • Ruby
