Tutorial: Deploying DeepSeek Locally on a PC

2025/2/5 13:36:28 / Source: https://www.cnblogs.com/king8/p/18699214

Ollama: A Guide to Running Large Models Locally


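This article introduces Ollama, a framework for running large models locally. It covers what Ollama is, how to install it and download models, how to chat in the terminal (help, model information, and related commands), how to call the API (generate and chat), and which Web UIs are available. Ollama supports many models, can be paired with a locally hosted visual interface, and has a low learning curve.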


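The author of this article is Yang Peng (杨鹏), a front-end engineer at 360 Qiwu Team (奇舞团).

Introduction to Ollama

Ollama is an open-source framework, written in Go, for running large models locally.

Official site: ollama.com/

GitHub: github.com/ollama/olla…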

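Installing Ollama

Download and install Ollama

On the Ollama website, choose the installer that matches your operating system; this guide uses macOS. Once it is installed, type ollama in a terminal to see the commands it supports.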

    Usage:
      ollama [flags]
      ollama [command]

    Available Commands:
      serve       Start ollama
      create      Create a model from a Modelfile
      show        Show information for a model
      run         Run a model
      pull        Pull a model from a registry
      push        Push a model to a registry
      list        List models
      cp          Copy a model
      rm          Remove a model
      help        Help about any command

    Flags:
      -h, --help      help for ollama
      -v, --version   Show version information

    Use "ollama [command] --help" for more information about a command.

Check the Ollama version

    ollama -v
    ollama version is 0.1.31

List downloaded models

    ollama list
    NAME        ID              SIZE      MODIFIED
    gemma:2b    b50d6c999e59    1.7 GB    3 hours ago

One model is already present on this machine. The next section shows how to download one.

Downloading a model

After installation, Ollama suggests the llama2 model by default. The table below lists some of the models Ollama supports.

| Model | Parameters | Size | Download |
| --- | --- | --- | --- |
| Llama 3 | 8B | 4.7GB | ollama run llama3 |
| Llama 3 | 70B | 40GB | ollama run llama3:70b |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |
| Solar | 10.7B | 6.1GB | ollama run solar |

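Llama 3 is the large language model that Meta open-sourced on April 19, 2024. It comes in 8-billion and 70-billion parameter versions, and Ollama already supports both.

This guide installs gemma 2b. Open a terminal and run the following command: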

    ollama run gemma:2b

    pulling manifest
    pulling c1864a5eb193... 100% ▕██████████████████████████████████████████████████████████▏ 1.7 GB
    pulling 097a36493f71... 100% ▕██████████████████████████████████████████████████████████▏ 8.4 KB
    pulling 109037bec39c... 100% ▕██████████████████████████████████████████████████████████▏ 136 B
    pulling 22a838ceb7fb... 100% ▕██████████████████████████████████████████████████████████▏ 84 B
    pulling 887433b89a90... 100% ▕██████████████████████████████████████████████████████████▏ 483 B
    verifying sha256 digest
    writing manifest
    removing any unused layers
    success

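After a short wait, the output reports that the model has been downloaded.

The table above lists only some of the models Ollama supports. More are available at ollama.com/library, including Chinese-language models such as Alibaba's Tongyi Qianwen (通义千问).

Chatting in the terminal

Once the download finishes, you can chat with the model directly in the terminal, for example by asking "介绍一下React" ("Introduce React"):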

    >>> 介绍一下React

The model prints its answer directly below the prompt.

Show help: /?

    >>> /?
    Available Commands:
      /set            Set session variables
      /show           Show model information
      /load <model>   Load a session or model
      /save <model>   Save your current session
      /bye            Exit
      /?, /help       Help for a command
      /? shortcuts    Help for keyboard shortcuts

    Use """ to begin a multi-line message.

Show model information: /show

    >>> /show
    Available Commands:
      /show info         Show details for this model
      /show license      Show model license
      /show modelfile    Show Modelfile for this model
      /show parameters   Show parameters for this model
      /show system       Show system message
      /show template     Show prompt template

Show model details: /show info

    >>> /show info
    Model details:
    Family              gemma
    Parameter Size      3B
    Quantization Level  Q4_0

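Calling the API

Besides chatting in the terminal, Ollama can also be called through an HTTP API. Running ollama show --help shows, under the OLLAMA_HOST environment variable, that the local address is http://localhost:11434.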

    ollama show --help
    Show information for a model

    Usage:
      ollama show MODEL [flags]

    Flags:
      -h, --help         help for show
          --license      Show license of a model
          --modelfile    Show Modelfile of a model
          --parameters   Show parameters of a model
          --system       Show system message of a model
          --template     Show template of a model

    Environment Variables:
          OLLAMA_HOST    The host:port or base URL of the Ollama server (e.g. http://localhost:11434)
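
The help text says OLLAMA_HOST may hold either a host:port pair or a full base URL. A client script has to normalize both forms before building request URLs. The following is a minimal Python sketch of that idea, not Ollama's own logic: the function name resolve_base_url is hypothetical, and falling back to localhost and port 11434 when a part is missing is an assumption based on the example in the help text.

```python
import os

DEFAULT_BASE_URL = "http://localhost:11434"  # address shown in the help text


def resolve_base_url(env=None):
    """Turn an OLLAMA_HOST value into a base URL usable by an HTTP client.

    Assumption: a value without a scheme is treated as host[:port] over
    plain http. IPv6 literals are not handled in this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_HOST", "").strip().rstrip("/")
    if not raw:
        return DEFAULT_BASE_URL
    if "://" in raw:
        # Already a full base URL: keep it as given.
        return raw
    host, _, port = raw.partition(":")
    return f"http://{host or 'localhost'}:{port or '11434'}"


print(resolve_base_url({"OLLAMA_HOST": "127.0.0.1:8080"}))
```

Passing the environment as a dictionary keeps the function easy to try without changing real environment variables; calling it with no argument reads os.environ.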

The two main APIs are covered below: generate and chat.

generate

  • Streaming response
    curl http://localhost:11434/api/generate -d '{
      "model": "gemma:2b",
      "prompt": "介绍一下React,20字以内"
    }'

    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.337192Z","response":"React","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.421481Z","response":" 是","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.503852Z","response":"一个","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.584813Z","response":"用于","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.672575Z","response":"构建","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.754663Z","response":"用户","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.837639Z","response":"界面","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.918767Z","response":"(","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:32.998863Z","response":"UI","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.080361Z","response":")","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.160418Z","response":"的","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.239247Z","response":" JavaScript","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.318396Z","response":" 库","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.484203Z","response":"。","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.671075Z","response":"它","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.751622Z","response":"允许","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.833298Z","response":"开发者","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:33.919385Z","response":"轻松","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.007706Z","response":"构建","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.09201Z","response":"可","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.174897Z","response":"重","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.414743Z","response":"用的","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.497013Z","response":" UI","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.584026Z","response":",","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.669825Z","response":"并","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.749524Z","response":"与","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.837544Z","response":"各种","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:34.927049Z","response":" JavaScript","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.008527Z","response":" ","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.088936Z","response":"框架","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.176094Z","response":"一起","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.255251Z","response":"使用","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.34085Z","response":"。","done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T10:12:35.428575Z","response":"","done":true,"context":[106,1645,108,25661,18071,22469,235365,235284,235276,235960,179621,107,108,106,2516,108,22469,23437,5121,40163,81964,16464,57881,235538,5639,235536,235370,22978,185852,235362,236380,64032,227725,64727,81964,235553,235846,37694,13566,235365,236203,235971,34384,22978,235248,90141,19600,7060,235362,107,108],"total_duration":3172809302,"load_duration":983863,"prompt_eval_duration":80181000,"eval_count":34,"eval_duration":3090973000}
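
In the streamed form, each line is a separate JSON object carrying one fragment of the answer in its "response" field, and the last object has "done": true. A client therefore has to parse line by line and concatenate the fragments. The sketch below shows one way to do that in Python; join_stream is a hypothetical helper name, and the sample lines are shortened copies of the output above (the created_at field is omitted).

```python
import json


def join_stream(lines):
    """Concatenate the "response" fragments of a streamed generate reply."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the stream
    return "".join(parts)


sample = [
    '{"model":"gemma:2b","response":"React","done":false}',
    '{"model":"gemma:2b","response":" 是","done":false}',
    '{"model":"gemma:2b","response":"一个","done":false}',
    '{"model":"gemma:2b","response":"","done":true}',
]
print(join_stream(sample))
```

Because the helper only needs an iterable of lines, it should also work on a live HTTP response object that yields one line at a time, such as the one returned by urllib.request.urlopen.
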
  • Non-streaming response

Setting "stream": false makes the server return the whole answer in a single JSON object.

    curl http://localhost:11434/api/generate -d '{
      "model": "gemma:2b",
      "prompt": "介绍一下React,20字以内",
      "stream": false
    }'

    {
      "model": "gemma:2b",
      "created_at": "2024-04-19T08:53:14.534085Z",
      "response": "React 是一个用于构建用户界面的大型 JavaScript 库,允许您轻松创建动态的网站和应用程序。",
      "done": true,
      "context": [106, 1645, 108, 25661, 18071, 22469, 235365, 235284, 235276, 235960, 179621, 107, 108, 106, 2516, 108, 22469, 23437, 5121, 40163, 81964, 16464, 236074, 26546, 66240, 22978, 185852, 235365, 64032, 236552, 64727, 22957, 80376, 235370, 37188, 235581, 79826, 235362, 107, 108],
      "total_duration": 1864443127,
      "load_duration": 2426249,
      "prompt_eval_duration": 101635000,
      "eval_count": 23,
      "eval_duration": 1757523000
    }
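
The timing fields at the end of the reply can be turned into a rough generation speed. The sketch below assumes, following Ollama's API documentation rather than anything stated in this article, that the duration fields are in nanoseconds and that eval_count is the number of generated tokens. The function name tokens_per_second is hypothetical, and the numbers are copied from the reply above.

```python
NANOSECONDS_PER_SECOND = 1e9


def tokens_per_second(reply):
    """Generated tokens divided by generation time (assumed nanoseconds)."""
    return reply["eval_count"] / reply["eval_duration"] * NANOSECONDS_PER_SECOND


reply = {
    "total_duration": 1864443127,
    "load_duration": 2426249,
    "prompt_eval_duration": 101635000,
    "eval_count": 23,
    "eval_duration": 1757523000,
}

print(f"{tokens_per_second(reply):.1f} tokens/s")
print(f"{reply['total_duration'] / NANOSECONDS_PER_SECOND:.2f} s in total")
```

Under those assumptions, the sample reply works out to roughly 13 tokens per second for gemma:2b on the author's machine.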

chat

  • Streaming response
    curl http://localhost:11434/api/chat -d '{
      "model": "gemma:2b",
      "messages": [
        { "role": "user", "content": "介绍一下React,20字以内" }
      ]
    }'

The terminal prints:

    {"model":"gemma:2b","created_at":"2024-04-19T08:45:54.86791Z","message":{"role":"assistant","content":"React"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:54.949168Z","message":{"role":"assistant","content":"是"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.034272Z","message":{"role":"assistant","content":"用于"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.119119Z","message":{"role":"assistant","content":"构建"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.201837Z","message":{"role":"assistant","content":"用户"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.286611Z","message":{"role":"assistant","content":"界面"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.37054Z","message":{"role":"assistant","content":" React"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.45099Z","message":{"role":"assistant","content":"."},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.534105Z","message":{"role":"assistant","content":"js"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.612744Z","message":{"role":"assistant","content":"框架"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.695129Z","message":{"role":"assistant","content":","},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.775357Z","message":{"role":"assistant","content":"允许"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.855803Z","message":{"role":"assistant","content":"开发者"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:55.936518Z","message":{"role":"assistant","content":"轻松"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.012203Z","message":{"role":"assistant","content":"地"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.098045Z","message":{"role":"assistant","content":"创建"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.178332Z","message":{"role":"assistant","content":"动态"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.255488Z","message":{"role":"assistant","content":"网页"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.336361Z","message":{"role":"assistant","content":"。"},"done":false}
    {"model":"gemma:2b","created_at":"2024-04-19T08:45:56.415904Z","message":{"role":"assistant","content":""},"done":true,"total_duration":2057551864,"load_duration":568391,"prompt_eval_count":11,"prompt_eval_duration":506238000,"eval_count":20,"eval_duration":1547724000}

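The response is streamed by default. As with generate, setting "stream": false returns everything at once.

The difference between the two is that generate produces a one-off completion from a single prompt, whereas chat accepts the earlier messages of a conversation, which makes multi-turn dialogue possible.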
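
Multi-turn dialogue with chat comes down to resending the growing messages list on every request. The following Python sketch illustrates that pattern with the standard library only. The helper names (ask, chat_once, _http_post) and the send parameter are hypothetical, and the reply shape (a "message" object with "role" and "content") is assumed to match the non-streaming form of the chat output shown above.

```python
import json
import urllib.request


def _http_post(url, payload):
    """POST a JSON payload and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def chat_once(messages, model="gemma:2b",
              base_url="http://localhost:11434", send=None):
    """Send the whole history to /api/chat and return the assistant message."""
    payload = {"model": model, "messages": messages, "stream": False}
    reply = (send or _http_post)(base_url + "/api/chat", payload)
    return reply["message"]


def ask(history, question, **kwargs):
    """Append a user turn, call the model, and record its answer."""
    history.append({"role": "user", "content": question})
    answer = chat_once(history, **kwargs)
    history.append(answer)  # keep the reply so the next turn has context
    return answer["content"]
```

With a local server running, calling ask(history, "介绍一下React,20字以内") and then ask(history, "再详细一点") on the same history list would send the first exchange along with the second question. The send parameter exists only so the transport can be swapped out, for example when trying the helpers without a server.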

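Web UI

Besides the terminal and the API, there are many open-source Web UIs that let you host a visual chat page locally, for example:

  • open-webui

github.com/open-webui/…

  • lollms-webui

github.com/ParisNeo/lo…

With Ollama, the learning cost of running a large model locally is already very low. If you are interested, try deploying one yourself 🎉🎉🎉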

