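[TensorFlow 2.x] Installing tensorflow-gpu on Ubuntu

I have been asked several times why an installed tensorflow cannot use the GPU. I had sorted this out a few times before, but when someone asked again a couple of days ago it still took a long time to get working, so here is a brief record of the problems I ran into and how I solved them.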

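I. Installation

(1) Install and update conda

1. Install conda

Installing conda matters: installing tensorflow-gpu with pip causes too many problems (this post assumes conda is already installed).

2. Update conda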

conda update -n base -c defaults conda --repodata-fn=repodata.json

Previously, following what I found on Baidu, I always ran:

conda update -n base -c defaults conda

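That command has two problems, however. First, it fails to update conda to the latest version. Second, when the version is checked with conda -V afterwards, the reported conda version is wrong.

The fix comes from GitHub: I got update warning message but unable to update · Issue #12519 · conda/conda: https://github.com/conda/conda/issues/12519

I believe the reason to bring the base conda up to date is that it can then sync the latest package dependency metadata; an outdated version may cause dependency resolution problems.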
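To confirm that the update took effect, the version reported by conda -V can be read from a script. The following is a minimal sketch, not part of the original procedure; the function names are hypothetical, and the sample string "conda 23.5.0" used below is an illustrative value, not a version taken from this post.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_conda_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the numeric version from output such as 'conda 23.5.0'."""
    match = re.search(r"conda\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_conda_version() -> Optional[Tuple[int, ...]]:
    """Run `conda -V`; return None when conda is not on PATH."""
    if shutil.which("conda") is None:
        return None
    result = subprocess.run(
        ["conda", "-V"], capture_output=True, text=True, check=False
    )
    # Look at both streams in case the version is not written to stdout.
    return parse_conda_version(result.stdout + result.stderr)


if __name__ == "__main__":
    print("conda version:", installed_conda_version())
```

Returning a tuple of integers makes versions comparable, for example parse_conda_version("conda 23.5.0") >= (23, 0, 0).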

(2) Create the environment

1. Create the environment

conda create -n TensorFlow2.4 python=3.9

Of course, you can pick the tensorflow version that matches your own CUDA version here. My CUDA version is 11.3:

(Tensorflow2.4) name@eclab:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0
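If the CUDA release is needed in a script (for example, to decide which tensorflow version to request), it can be pulled out of this output with a regular expression. This is a small sketch under the assumption that the output keeps the "release X.Y" wording shown above; parse_cuda_release is a hypothetical helper name.

```python
import re
from typing import Optional

# Sample text copied from the nvcc -V output shown above.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0
"""


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' CUDA release, or None if it is not present."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


print(parse_cuda_release(SAMPLE))  # prints 11.3
```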

If nvcc -V does not print the version (for example, it reports command not found), search further for the cause. This post covers one such case:

Ubuntu 20.04 LTS: CUDA is installed but nvcc -V reports "command not found" (AISecurity盐究员's CSDN blog): https://blog.csdn.net/m0_38068876/article/details/127836484

Note: when adding the environment variables to .bashrc, adjust the paths to match the actual CUDA directories under /usr/local/. Here is my setup:

(Tensorflow2.4) name@eclab:~$ cd /usr/local/
(Tensorflow2.4) name@eclab:/usr/local$ ls
bin   cuda-11.3  games    lib  sbin   src
cuda  etc        include  man  share  sunlogin

There is a cuda symlink here that points to cuda-11.3, so I suggest adding the following lines:

# cuda-11.3
export PATH=/usr/local/cuda-11.3/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
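The ${PATH:+:${PATH}} form in these lines appends ":" plus the old value only when the variable is already set and non-empty, which avoids a trailing colon (an empty entry in such a list is treated as the current directory). The same rule, sketched in Python purely for illustration; prepend_path and has_entry are hypothetical helper names:

```python
import os


def prepend_path(new_dir: str, current: str = "") -> str:
    """Mimic NEW${VAR:+:${VAR}}: add ':' + current only if current is non-empty."""
    return f"{new_dir}:{current}" if current else new_dir


def has_entry(path_value: str, directory: str) -> bool:
    """Return True if directory is one of the ':'-separated entries."""
    entries = [entry.rstrip("/") for entry in path_value.split(":") if entry]
    return directory.rstrip("/") in entries


if __name__ == "__main__":
    # Check whether the CUDA directories from this post are already configured.
    print(has_entry(os.environ.get("PATH", ""), "/usr/local/cuda-11.3/bin"))
    print(has_entry(os.environ.get("LD_LIBRARY_PATH", ""), "/usr/local/cuda-11.3/lib64"))
```

Running the script in a new terminal after editing .bashrc shows whether both entries were picked up.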

(3) Install tensorflow-gpu

1. Install

Activate the environment:

conda activate TensorFlow2.4

Install tensorflow-gpu:

conda install tensorflow-gpu

Note: do not install it with pip! Do not install it with pip! Do not install it with pip!

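I did not pin a tensorflow-gpu version here; conda automatically selected tensorflow-gpu==2.4.1 (for the version correspondence, see the "Build from source" page on the TensorFlow site). Note that the test log below shows libcudart.so.10.1 and libcudnn.so.7 being loaded, which suggests that the conda package pulls in its own CUDA 10.1 runtime and cuDNN 7 instead of relying on the system CUDA 11.3.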

2. Test

Running the following two commands in the Python interpreter is enough:

(Tensorflow2.4) name@eclab:/usr/local$ python
Python 3.9.17 (main, Jul  5 2023, 20:41:20)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-07-10 10:22:16.571135: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
>>> tf.config.list_physical_devices('GPU')
2023-07-10 10:22:27.565493: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2023-07-10 10:22:27.567453: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-07-10 10:22:27.611185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.612680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 1 with properties:
pciBusID: 0000:03:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.613857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 2 with properties:
pciBusID: 0000:82:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.614783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 3 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:22:27.614821: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2023-07-10 10:22:27.617316: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2023-07-10 10:22:27.617370: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2023-07-10 10:22:27.619509: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-07-10 10:22:27.619882: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-07-10 10:22:27.622449: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2023-07-10 10:22:27.623913: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2023-07-10 10:22:27.629319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2023-07-10 10:22:27.644606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0, 1, 2, 3
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]
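The same check can be run non-interactively, which is handy when several environments need to be compared. This is a minimal sketch that uses the tf.config.list_physical_devices call from the session above; list_gpus is a hypothetical wrapper that returns None when tensorflow is not installed in the active environment.

```python
import importlib.util
from typing import List, Optional


def list_gpus() -> Optional[List[str]]:
    """Return the names of GPUs visible to TensorFlow, or None if it is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("tensorflow is not installed in this environment")
    else:
        print(f"{len(gpus)} GPU(s) visible: {gpus}")
```

An empty list means tensorflow imported correctly but registered no GPU, which is the failure case discussed in the next section.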

(4) Problems with installing via pip

The following shows what goes wrong when pip is used instead.

(base) name@eclab:~$ conda create -n Tensorflow-err python=3.9
(Tensorflow-err) name@eclab:~$ pip install tensorflow-gpu==2.4.1
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==2.4.1 (from versions: 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0, 2.12.0)
ERROR: No matching distribution found for tensorflow-gpu==2.4.1

First, 2.4.1 cannot be installed with pip. Going by the version list in the error message, I installed 2.5.0 instead:

pip install tensorflow-gpu==2.5

Then run the same test as in step (3), item 2 (Test):

(Tensorflow-err) name@eclab:~$ python
Python 3.9.17 (main, Jul  5 2023, 20:41:20)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-07-10 10:38:37.238756: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
>>> tf.config.list_physical_devices('GPU')
2023-07-10 10:38:40.413250: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-07-10 10:38:40.456066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.457549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:03:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.458707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 2 with properties:
pciBusID: 0000:82:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.459651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 3 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2023-07-10 10:38:40.459700: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-07-10 10:38:40.464266: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2023-07-10 10:38:40.464333: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2023-07-10 10:38:40.465775: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2023-07-10 10:38:40.466117: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2023-07-10 10:38:40.467045: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2023-07-10 10:38:40.468303: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2023-07-10 10:38:40.468555: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.3/lib64
2023-07-10 10:38:40.468578: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]

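The key error is the line below: TensorFlow cannot find libcudnn.so.8 in LD_LIBRARY_PATH (/usr/local/cuda-11.3/lib64), so it skips registering the GPU devices and returns an empty list: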

tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.3/lib64
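When a log like this is long, the missing libraries and the directories that were searched can be extracted automatically. The sketch below parses the warning quoted above and checks whether the library file exists in any of the listed directories; the helper names are hypothetical, and the regular expressions assume the message wording shown in this log.

```python
import os
import re
from typing import List

# The warning line quoted above.
WARNING = (
    "tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: "
    "cannot open shared object file: No such file or directory; "
    "LD_LIBRARY_PATH: /usr/local/cuda-11.3/lib64"
)


def missing_libraries(log_text: str) -> List[str]:
    """Return every library name reported as 'Could not load dynamic library'."""
    return re.findall(r"Could not load dynamic library '([^']+)'", log_text)


def search_dirs(log_text: str) -> List[str]:
    """Return the directories listed after 'LD_LIBRARY_PATH:' in the log."""
    match = re.search(r"LD_LIBRARY_PATH:\s*(\S+)", log_text)
    return [d for d in match.group(1).split(":") if d] if match else []


def locate(library: str, directories: List[str]) -> List[str]:
    """Return the full paths where the library file actually exists."""
    candidates = [os.path.join(d, library) for d in directories]
    return [path for path in candidates if os.path.exists(path)]


libs = missing_libraries(WARNING)
dirs = search_dirs(WARNING)
print(libs)  # ['libcudnn.so.8']
print(dirs)  # ['/usr/local/cuda-11.3/lib64']
for lib in libs:
    print(lib, "found at:", locate(lib, dirs))
```

An empty "found at" list confirms that the file is absent from every directory in LD_LIBRARY_PATH, which matches the dlerror text.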
