VTune + Sampling Drivers Environment Setup (Local and Remote)

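Table of Contents

  • 1. Test Environment
  • 2. Installing VTune
    • 2.1 Download
    • 2.2 Installation
    • 2.3 Testing
    • 2.4 Self-Check
    • 2.5 Enabling Additional Features
      • 2.5.1 ptrace
      • 2.5.2 Sampling Drivers
    • 2.6 Memory Access Analysis
  • 3. Installing the Sampling Drivers
    • 3.1 Getting the Sampling Drivers
    • 3.2 Building the Sampling Drivers
    • 3.3 Installing the Sampling Drivers
    • 3.4 Loading the Sampling Drivers at Boot
    • 3.5 Testing
      • 3.5.1 [Optional] GUI (Checking the Memory Access Analysis)
      • 3.5.2 Re-running the Self-Check
  • 4. Remote VTune Profiler
    • 4.1 Preparation
      • 4.1.1 Installing VTune (Local and Remote)
      • 4.1.2 Setting Up Passwordless SSH Login
      • 4.1.3 Trying to Connect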

1. Test Environment

Ubuntu 20.04

2. Installing VTune

2.1 Download

Download page: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler-download.html

2.2 Installation

There are several ways to install VTune. I chose the offline installer:

wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/4466ed1b-5d4a-4b30-9146-1eabc336c647/l_oneapi_vtune_p_2023.1.0.44286_offline.sh
sudo sh ./l_oneapi_vtune_p_2023.1.0.44286_offline.sh

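If a graphical desktop is available, the installer opens its GUI automatically; otherwise it runs in the terminal. For convenience I kept the default installation path. The installation itself is straightforward. For other installation methods, see: https://www.intel.com/content/www/us/en/docs/vtune-profiler/installation-guide/2023-0/linux.html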

2.3 Testing

Open a terminal and run the following (assuming the default installation path):

source /opt/intel/oneapi/setvars.sh # this line can later be added to ~/.bashrc
# Check that vtune-gui or vtune starts correctly
vtune-gui
# Or run the command-line vtune (no GUI)
vtune -version

2.4 Self-Check

Some VTune features depend on specific hardware and software support, so it is worth checking in advance:

cd /opt/intel/oneapi/vtune/latest
python3 ./bin64/self_check.py

Output:

Intel(R) VTune(TM) Profiler Self Check Utility
Copyright (C) 2009 Intel Corporation. All rights reserved.
Build Number: 625246

HW event-based analysis (counting mode)
Example of analysis types: Performance Snapshot
Collection: Ok
Finalization: Ok...
Report: Ok

Instrumentation based analysis check
Example of analysis types: Hotspots and Threading with user-mode sampling
Collection: Fail
vtune: Error: Cannot start data collection because the scope of ptrace system call is limited. To enable profiling, please set /proc/sys/kernel/yama/ptrace_scope to 0. To make this change permanent, set kernel.yama.ptrace_scope to 0 in /etc/sysctl.d/10-ptrace.conf and reboot the machine.
vtune: Warning: Microarchitecture performance insights will not be available. Make sure the sampling driver is installed and enabled on your system.

HW event-based analysis check
Example of analysis types: Hotspots with HW event-based sampling, HPC Performance Characterization, etc.
Collection: Fail
vtune: Error: This analysis requires one of these actions: a) Install Intel Sampling Drivers. b) Configure driverless collection with Perf system-wide profiling. To enable Perf system-wide profiling, set /proc/sys/kernel/perf_event_paranoid to 1 or set up Perf tool capabilities.
vtune: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel module symbols.

HW event-based analysis check
Example of analysis types: Microarchitecture Exploration
Collection: Fail
vtune: Error: This analysis requires one of these actions: a) Install Intel Sampling Drivers. b) Configure driverless collection with Perf system-wide profiling. To enable Perf system-wide profiling, set /proc/sys/kernel/perf_event_paranoid to 0 or set up Perf tool capabilities.
vtune: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel module symbols.

HW event-based analysis with uncore events
Example of analysis types: Memory Access
Collection: Fail
vtune: Error: Cannot collect memory bandwidth data. Make sure the sampling driver is installed and enabled on your system. See the Sampling Drivers help topic for more details. Note that memory bandwidth collection is not possible if you are profiling inside a virtualized environment.

HW event-based analysis with stacks
Example of analysis types: Hotspots with HW event-based sampling and call stacks
Collection: Fail
vtune: Error: To run this analysis, do one of the following:
* Set the Stack size option to the unlimited value (0 in command line).
* Provide access to the performance events system with the /proc/sys/kernel/perf_event_paranoid value set to 2 or lower.
You can also configure driverless collection using Perf tool capabilities.
vtune: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel module symbols.
vtune: Error: Unlimited stack size (0) not allowed in driverless mode.

HW event-based analysis with context switches
Example of analysis types: Threading with HW event-based sampling
Collection: Fail
vtune: Error: This analysis requires one of these actions: a) Install Intel Sampling Drivers. b) Configure driverless collection with Perf system-wide profiling. To enable Perf system-wide profiling, set /proc/sys/kernel/perf_event_paranoid to 1 or set up Perf tool capabilities.
vtune: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel module symbols.
vtune: Warning: Context switch data cannot be collected in the current driverless mode if the kernel version is less than 4.3 or /proc/sys/kernel/perf_event_paranoid value is greater than 1. Update your system configuration for  or consider switching to the Intel sampling driver by setting an unlimited (0) value for the Stack size option.
vtune: Warning: VTune Profiler driver with insufficient permission is detected on the system.
vtune: Warning: Consider setting proper driver permissions (see the "Sampling Drivers" help topic).
vtune: Warning: Otherwise, the driverless collection with limited analysis support will be enabled by default.

Checking DPC++ application as prerequisite for GPU analyses: Fail
Unable to run DPC++ application on GPU connected to this system. If you are using an Intel GPU and want to verify profiling support for DPC++ applications, check these requirements:
* Install Intel(R) GPU driver.
* Install Intel(R) Level Zero GPU runtime.
* Install Intel(R) oneAPI DPC++ Runtime and set the environment.

The check observed a product failure on your system.
Review errors in the output above to fix a problem or contact Intel technical support.

The system is ready for the following analyses:
* Performance Snapshot

The following analyses have failed on the system:
* Hotspots and Threading with user-mode sampling
* Hotspots with HW event-based sampling, HPC Performance Characterization, etc.
* Microarchitecture Exploration
* Memory Access
* Hotspots with HW event-based sampling and call stacks
* Threading with HW event-based sampling
* GPU Compute/Media Hotspots (characterization mode)
* GPU Compute/Media Hotspots (source analysis mode)

Log location: /tmp/vtune-tmp-dell/self-checker-2023.07.18_02.16.19/log.txt
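
The error messages in this log point at three kernel settings: ptrace_scope, perf_event_paranoid, and kptr_restrict. Before changing anything it can help to see their current values. The following is only a minimal sketch; show_knob is a hypothetical helper name, not part of VTune.

```shell
#!/bin/sh
# show_knob (hypothetical helper): print "<path> = <value>" for one procfs setting,
# or a placeholder when the file cannot be read.
show_knob() {
    if [ -r "$1" ]; then
        printf '%s = %s\n' "$1" "$(cat "$1")"
    else
        printf '%s = (not readable)\n' "$1"
    fi
}

# The three settings named in the self-check messages.
for knob in \
    /proc/sys/kernel/yama/ptrace_scope \
    /proc/sys/kernel/perf_event_paranoid \
    /proc/sys/kernel/kptr_restrict
do
    show_knob "$knob"
done
```

Comparing these values with the thresholds quoted in the messages shows which analyses are blocked by kernel settings and which need the sampling drivers.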

2.5 Enabling Additional Features

2.5.1 ptrace

# ptrace
sudo vim /etc/sysctl.d/10-ptrace.conf # set kernel.yama.ptrace_scope to 0
sudo sysctl --system | grep yama # apply the configuration (alternatively, reboot the machine)
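
For scripted setups, the vim step can be replaced by a non-interactive edit. The sketch below is one possible approach, not the method from the VTune documentation; set_ptrace_scope is a hypothetical helper that rewrites the kernel.yama.ptrace_scope line of a given file and leaves every other line untouched.

```shell
#!/bin/sh
# set_ptrace_scope FILE VALUE (hypothetical helper).
# The body runs in a subshell, "( ... )", so its variables do not leak
# into the calling shell.
set_ptrace_scope() (
    conf=$1
    value=$2
    tmp=$(mktemp) || exit 1
    # Copy every line except an existing ptrace_scope setting ...
    grep -v '^[[:space:]]*kernel\.yama\.ptrace_scope' "$conf" > "$tmp" 2>/dev/null
    # ... then append the new setting.
    printf 'kernel.yama.ptrace_scope = %s\n' "$value" >> "$tmp"
    # Write back through the original file so its owner and mode are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
)

# Example, to be run in a root shell:
#   set_ptrace_scope /etc/sysctl.d/10-ptrace.conf 0 && sysctl --system | grep yama
```

To change the running kernel immediately without editing any file, sudo sysctl -w kernel.yama.ptrace_scope=0 also works, but that setting is lost on reboot.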

2.5.2 Sampling Drivers

See Section 3.

2.6 Memory Access Analysis

The Memory Access analysis requires the Sampling Drivers. Without them VTune reports an error (I did not save a screenshot).

3. Installing the Sampling Drivers

3.1 Getting the Sampling Drivers

According to the documentation, the driver source code ships with the local VTune installation:

$ ls /opt/intel/oneapi/vtune/latest/sepdk
include  src  vtune-layer

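If the sources are missing locally, a tarball version is available for download. After downloading it, extract it into the corresponding directory (/opt/intel/oneapi/vtune/latest/sepdk):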

sudo mkdir -p /opt/intel/oneapi/vtune/latest/sepdk
sudo tar zxvf sepdk.tar.gz -C /opt/intel/oneapi/vtune/latest/sepdk

3.2 Building the Sampling Drivers

Following the documentation:

$ cd /opt/intel/oneapi/vtune/latest/sepdk/src
$ sudo ./build-driver
....
************ Built drivers are copied to /opt/intel/oneapi/vtune/2023.1.0/sepdk/src/socwatch/drivers directory ************
Done
Done building the drivers

3.3 Installing the Sampling Drivers

cd /opt/intel/oneapi/vtune/latest/sepdk/src
sudo ./insmod-sep -r -g sudo

The -g option specifies the user group that may use the driver; here it is the sudo group. The command getent group sudo lists the members of the sudo group.
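
After loading the drivers, a quick sanity check is to look for them in the kernel module list and to confirm that the current user is in the group passed to -g. This is a rough sketch: list_sampling_modules is a hypothetical helper, and the module name prefixes it matches (sep, socperf, pax, vtsspp) are an assumption that may differ between VTune versions.

```shell
#!/bin/sh
# list_sampling_modules [FILE] (hypothetical helper): print the names of loaded
# kernel modules that look like VTune sampling drivers.
# ASSUMPTION: the driver modules have names starting with sep, socperf, pax, or vtsspp.
list_sampling_modules() {
    awk '$1 ~ /^(sep|socperf|pax|vtsspp)/ { print $1 }' "${1:-/proc/modules}"
}

# Print the matching modules, if the module list is available.
[ -r /proc/modules ] && list_sampling_modules

# Prints "sudo" if the current user is in the sudo group (the group given
# to -g above), otherwise prints a warning.
id -nG | tr ' ' '\n' | grep -x sudo || echo "current user is not in the sudo group"
```

If no module is printed, the drivers are either not loaded or use names outside the assumed prefixes, so check the output of lsmod by hand.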

3.4 Loading the Sampling Drivers at Boot

cd /opt/intel/oneapi/vtune/latest/sepdk/src
sudo ./boot-script --install -g sudo

3.5 Testing

3.5.1 [Optional] GUI (Checking the Memory Access Analysis)

vtune-gui

Create a new project and select Memory Access.
[Screenshot: completed Memory Access analysis]

3.5.2 Re-running the Self-Check

cd /opt/intel/oneapi/vtune/latest
python3 ./bin64/self_check.py

The output is shown below. Most analyses are now available; only the GPU ones still fail.

The system is ready for the following analyses:
* Performance Snapshot
* Hotspots and Threading with user-mode sampling
* Hotspots with HW event-based sampling, HPC Performance Characterization, etc.
* Microarchitecture Exploration
* Memory Access
* Hotspots with HW event-based sampling and call stacks
* Threading with HW event-based sampling

The following analyses have failed on the system:
* GPU Compute/Media Hotspots (characterization mode)
* GPU Compute/Media Hotspots (source analysis mode)
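
When the self-check is repeated often, it is convenient to print only the list of working analyses. The sketch below filters the summary section out of the self-check output; ready_analyses is a hypothetical helper, and it assumes the summary headings are worded exactly as in the output above.

```shell
#!/bin/sh
# ready_analyses (hypothetical helper): read self_check.py output on stdin and
# print the analyses listed under "The system is ready for the following analyses:".
ready_analyses() {
    awk '
        /The system is ready for the following analyses:/ { in_ready = 1; next }
        /The following analyses have failed/              { in_ready = 0 }
        in_ready && /^[ \t]*\* / { sub(/^[ \t]*\* /, ""); print }
    '
}

# Example:
#   cd /opt/intel/oneapi/vtune/latest
#   python3 ./bin64/self_check.py | ready_analyses
```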

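4. Remote VTune Profiler

4.1 Preparation

4.1.1 Installing VTune (Local and Remote)

The local machine opens the Intel VTune GUI, so VTune must be installed there (the drivers are probably not needed locally, but I have not tested this).
The remote machine runs the Intel VTune software as well, so VTune must also be installed there.
The installation steps are the same as described above.
If the remote server is not set up correctly (or the IP and port are wrong), VTune reports this error: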

Please, check that the command '/opt/intel/oneapi/vtune/latest/bin64/amplxe-runss -V' is run successfully on the target.

4.1.2 Setting Up Passwordless SSH Login

One way to do it (other methods are omitted here):

ssh-copy-id user@ip -p port
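
Before switching to the VTune GUI, the key-based login and the remote installation can be checked from the terminal. The sketch below runs the command quoted in the error message from 4.1.1 over SSH. check_remote_vtune is a hypothetical helper, and user@ip and port are placeholders, as in the command above.

```shell
#!/bin/sh
# check_remote_vtune TARGET PORT (hypothetical helper).
# TARGET is user@ip, PORT is the SSH port.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a successful run also confirms that key-based login works.
check_remote_vtune() {
    ssh -o BatchMode=yes -p "$2" "$1" \
        /opt/intel/oneapi/vtune/latest/bin64/amplxe-runss -V
}

# Example (replace the placeholders with real values):
#   check_remote_vtune user@ip port && echo "remote target looks ready"
```

If this command fails, the VTune GUI is unlikely to connect either, and the SSH error output usually shows whether the login or the remote VTune path is the problem.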

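4.1.3 Trying to Connect

The steps in the VTune GUI are:
1. Set the IP, user, and port. Note that the format is user@ip:port
2. Specify the directory
3. Specify the application
[Screenshot: remote target configuration in the VTune GUI]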
