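This guide mainly follows a Zhihu post:
MiniGPT-4 local deployment on an RTX 3090 - Zhihu
Deploying MiniGPT-4 is fairly involved. You first obtain the LLaMA weights, then combine them with the Vicuna delta files (the referenced post describes them as bitwise XOR deltas; FastChat's apply_delta script, used in step 4, adds the delta to the base weights) to generate the Vicuna model weights, and finally pair those with the pretrained MiniGPT-4 checkpoint to deploy the model. To make this easier to follow, I drew a flowchart:
System version: Ubuntu 20.04
My hardware: Nvidia GeForce RTX 3090, 24 GB of VRAM
1. Prepare the environment
Clone the MiniGPT-4 repository and create the environment specified in environment.yml.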
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4
……
2. Obtain the LLaMA weights
First, download the model weights from Hugging Face. Install huggingface_hub with pip:
pip install huggingface_hub
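Because of GPU memory limits, I chose the model with the fewest parameters, llama-7b-hf. The Hugging Face download links are as follows:
LLaMA:
decapoda-research (Decapoda Research)
This guide uses: decapoda-research/llama-7b-hf
decapoda-research/llama-7b-hf at main
Note: every file in the repository must be downloaded. The original post used snapshot_download; I downloaded the files manually from the web page instead, because git transfers tend to drop and the checkout can fail.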
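Since every file has to be present, it helps to check a manual download before moving on. The following is a minimal sketch, assuming the repository uses the usual sharded layout with a pytorch_model.bin.index.json file that maps parameter names to shard files. The function names missing_shards and download are my own, not part of any library; download simply wraps huggingface_hub's snapshot_download for those who prefer the scripted route.

```python
import json
from pathlib import Path


def missing_shards(model_dir):
    """Return the weight shard files named in the index but absent on disk."""
    model_dir = Path(model_dir)
    index_path = model_dir / "pytorch_model.bin.index.json"
    with index_path.open(encoding="utf-8") as f:
        weight_map = json.load(f)["weight_map"]
    # Each parameter name maps to the shard file that stores it.
    expected = sorted(set(weight_map.values()))
    return [name for name in expected if not (model_dir / name).is_file()]


def download(repo_id, local_dir):
    """Download a whole repo; needs huggingface_hub installed and network access."""
    from huggingface_hub import snapshot_download  # imported lazily on purpose
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

For example, missing_shards("/home/train/mycharm/MiniGPT-4/model/llama-7b-hf/") should return an empty list once the download is complete. This only checks that the shard files exist, not that they are intact.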
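3. Vicuna delta files
I used the vicuna-7b-delta-v1.1 model. The Hugging Face download links are as follows:
lmsys (Large Model Systems Organization)
lmsys/vicuna-7b-delta-v1.1 at main
Note: the Vicuna delta weights come in two versions, v0 and v1.1. The MiniGPT-4 authors used v0. According to the referenced post, generating the Vicuna weights with v0 fails with a tensor size mismatch, and switching to v1.1 fixes it. I also tried v0 earlier and could not get it to work, though not because of that error; the cause is still to be investigated. I therefore recommend v1.1.
4. Generate the Vicuna weights
Clone the FastChat repository: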
git clone https://github.com/lm-sys/FastChat.git
GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
Run the following command in a terminal:
python3 -m fastchat.model.apply_delta --base-model-path /home/train/mycharm/MiniGPT-4/model/llama-7b-hf/ --target-model-path /home/train/mycharm/new/vicuna --delta /home/train/mycharm/new/lmsys/lmsysvicuna-7b-delta-v1.1 --low-cpu-mem
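Notes:
--base-model-path is the original LLaMA weights (the 7B model), --target-model-path is where the generated Vicuna weights are written, and --delta is the Vicuna delta weights. On machines with limited CPU memory, add --low-cpu-mem: it splits the large weight files into smaller pieces and uses the disk as temporary storage, which keeps peak memory below 16 GB. Without it, the Vicuna delta files could not be loaded on my machine: CPU memory filled up and the process was killed. (In the original screenshot, green marks the existing vicuna-7b-delta weights.)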
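A rough back-of-the-envelope calculation shows why memory fills up without --low-cpu-mem. This is only a sketch under two assumptions of mine: the 7B model has about 6.7e9 parameters, and the weights are held as 2-byte float16 values. The helper name weights_gib is hypothetical.

```python
def weights_gib(num_params, bytes_per_param=2):
    """Approximate in-memory size of model weights in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumed figures: about 6.7e9 parameters, 2 bytes each (float16).
one_copy = weights_gib(6.7e9)
# Assumption: without --low-cpu-mem, base and delta are both resident at once.
both_copies = 2 * one_copy
print(f"one copy: {one_copy:.1f} GiB, base + delta: {both_copies:.1f} GiB")
```

Under these assumptions a single copy is roughly 12 to 13 GiB, so holding the base and delta weights together already exceeds 16 GB, which is consistent with the process being killed on a machine with limited RAM.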
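This command can be confusing for beginners. Put simply, it takes the LLaMA weights, combines them with the Vicuna delta weights, and produces the Vicuna weights. The indirection exists because Meta never officially released the LLaMA weights to the public; the copies that can be downloaded are just circulating online.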
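The idea of "base weights plus delta gives the target weights" can be illustrated with a toy example. This is not FastChat's code: it assumes the delta is additive (which is my understanding of what apply_delta does on real tensors) and uses plain Python lists in place of tensors. The parameter name layer.weight and all values are made up.

```python
def apply_delta(base, delta):
    """Add delta to base, parameter by parameter (toy version on lists)."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            # Shapes must agree, otherwise the element-wise sum is undefined.
            raise ValueError(f"size mismatch for {name}")
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Made-up values standing in for LLaMA (base) and Vicuna delta weights.
base = {"layer.weight": [1.0, 2.0, 3.0]}
delta = {"layer.weight": [0.5, -1.0, 0.25]}
vicuna = apply_delta(base, delta)
print(vicuna)
```

The delta on its own is useless without the base weights, which is why the LLaMA download in step 2 is required before the Vicuna weights can be produced.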
The output of the run is as follows:
The newly generated Vicuna weights are in the target directory:
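5. Prepare MiniGPT-4 for launch
This guide uses the original authors' checkpoint, prerained_minigpt4_7b.pth, placed in the directory of the generated Vicuna weights. Make sure it ends up in the right directory.
Download link: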
https://link.zhihu.com/?target=https%3A//drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view
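The file is hosted on Google Drive, so you need access to Google. Other checkpoint versions should probably work as well, but I have not tried them.
After downloading, put the file in the Vicuna directory generated above:
Update the weight directories in the configuration files:
The next two steps are critical. Change the weight paths to match your own setup:
1) In MiniGPT-4/minigpt4/configs/models/minigpt4.yaml, set llama_model to the path of the vicuna-7b weights (line 16 of the original file). Mine, for example, is /home/train/mycharm/new/vicuna/.
2) In MiniGPT-4/eval_configs/minigpt4_eval.yaml, set ckpt to the path of prerained_minigpt4_7b.pth (line 11 of the original file).
Mine, for example, is /home/train/mycharm/new/vicuna/prerained_minigpt4_7b.pth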
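The two config edits above can also be scripted instead of done by hand. Below is a minimal sketch that rewrites a single "key: value" line with a regular expression, so it needs no YAML library. The helpers set_yaml_value and update_config are hypothetical names of mine; the sketch assumes each key appears on its own line and it drops any trailing comment on that line.

```python
import re


def set_yaml_value(text, key, value):
    """Replace the value of the first `key: ...` line, keeping its indentation."""
    pattern = re.compile(rf"^([ \t]*){re.escape(key)}:.*$", re.MULTILINE)

    def replace(match):
        # Keep the original indentation, then write the quoted new value.
        return f'{match.group(1)}{key}: "{value}"'

    new_text, count = pattern.subn(replace, text, count=1)
    if count == 0:
        raise KeyError(f"{key} not found")
    return new_text


def update_config(path, key, value):
    """Rewrite one key in a YAML file on disk."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(set_yaml_value(text, key, value))
```

With my paths, the two calls would be update_config("MiniGPT-4/minigpt4/configs/models/minigpt4.yaml", "llama_model", "/home/train/mycharm/new/vicuna/") and update_config("MiniGPT-4/eval_configs/minigpt4_eval.yaml", "ckpt", "/home/train/mycharm/new/vicuna/prerained_minigpt4_7b.pth"). Back up the files first, since the script overwrites them in place.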
6. Launch the MiniGPT-4 demo
Change into the MiniGPT-4 directory and run:
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
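The demo starts successfully.
The output contains a warning, apparently caused by mismatched PyTorch and torchvision versions. It does not affect the demo; see the following post:
Failed to load image Python extension: libtorch_cuda_cu.so - 牧羊女说's blog - CSDN Blog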
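To check whether a torch/torchvision pair belongs together, a small lookup can help. This is only a sketch: the KNOWN_PAIRS table lists a few release pairings as I understand them and is not exhaustive, so verify against the official PyTorch compatibility table. The function names are my own.

```python
# Release pairings as I understand them; not an exhaustive or official list.
KNOWN_PAIRS = {
    "1.12": "0.13",
    "1.13": "0.14",
    "2.0": "0.15",
}


def major_minor(version):
    """Reduce a version like '1.13.1+cu116' to '1.13'."""
    public = version.split("+", 1)[0]  # drop the local build tag, e.g. +cu116
    return ".".join(public.split(".")[:2])


def is_known_pair(torch_version, torchvision_version):
    """True if the torchvision release is the one listed for this torch release."""
    expected = KNOWN_PAIRS.get(major_minor(torch_version))
    return expected is not None and expected == major_minor(torchvision_version)
```

In a live environment you would pass torch.__version__ and torchvision.__version__. The pair listed in the appendix (1.13.1+cu116 and 0.14.1+cu116) passes this check, so if the warning appears it is worth confirming which versions are actually imported at runtime.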
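Below is the original author's screenshot of the demo running:
Appendix:
Names and versions of the packages in the virtual environment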
Package | Version |
absl-py | 1.4.0 |
accelerate | 0.15.0 |
addict | 2.4.0 |
aiofiles | 23.1.0 |
aiohttp | 3.8.4 |
aiosignal | 1.3.1 |
altair | 5.0.1 |
anyio | 3.7.1 |
appdirs | 1.4.4 |
apturl | 0.5.2 |
argon2-cffi | 21.3.0 |
argon2-cffi-bindings | 21.2.0 |
arrow | 1.2.3 |
asttokens | 2.2.1 |
async-timeout | 4.0.2 |
attrs | 23.1.0 |
autopep8 | 2.0.1 |
backcall | 0.1.0 |
bcrypt | 3.1.7 |
beautifulsoup4 | 4.12.2 |
bleach | 6.0.0 |
blinker | 1.4 |
Brlapi | 0.7.0 |
cachetools | 5.3.0 |
catfish | 1.4.13 |
certifi | 2019.11.28 |
cffi | 1.15.1 |
chardet | 3.0.4 |
charset-normalizer | 3.1.0 |
chrome-gnome-shell | 0.0.0 |
Click | 7 |
cmake | 3.25.2 |
colorama | 0.4.3 |
coloredlogs | 15.0.1 |
comm | 0.1.3 |
command-not-found | 0.3 |
configobj | 5.0.6 |
contourpy | 1.0.7 |
cryptography | 2.8 |
cuda | 0.0.1 |
cupshelpers | 1 |
cycler | 0.11.0 |
dbus-python | 1.2.16 |
debugpy | 1.6.7 |
decorator | 4.4.2 |
defer | 1.0.6 |
defusedxml | 0.7.1 |
distro | 1.4.0 |
distro-info | 0.23ubuntu1 |
docker-pycreds | 0.4.0 |
dulwich | 0.19.15 |
duplicity | 0.8.12.0 |
entrypoints | 0.3 |
exceptiongroup | 1.1.2 |
executing | 1.2.0 |
fairscale | 0.4.13 |
fastapi | 0.99.1 |
fastChat | 0.1.1 |
fasteners | 0.14.1 |
fastimport | 0.9.8 |
fastjsonschema | 2.17.1 |
ffmpy | 0.3.0 |
filelock | 3.12.2 |
fire | 0.5.0 |
flatbuffers | 23.1.21 |
fonttools | 4.38.0 |
fqdn | 1.5.1 |
frozenlist | 1.3.3 |
fschat | 0.2.18 |
fsspec | 2023.6.0 |
future | 0.18.2 |
gitdb | 4.0.10 |
GitPython | 3.1.31 |
google-auth | 2.16.1 |
google-auth-oauthlib | 0.4.6 |
gradio | 3.35.2 |
gradio_client | 0.2.7 |
graphviz | 0.8.4 |
grpcio | 1.51.1 |
h11 | 0.14.0 |
h5py | 3.8.0 |
hiq-python | 1.1.12 |
httpcore | 0.17.3 |
httplib2 | 0.14.0 |
httpx | 0.24.1 |
huggingface-hub | 0.16.3 |
humanfriendly | 10 |
idna | 2.8 |
imageio | 2.22.2 |
importlib-metadata | 6.0.0 |
importlib-resources | 5.12.0 |
ipykernel | 6.24.0 |
ipython | 8.12.2 |
ipython_genutils | 0.2.0 |
ipywidgets | 8.0.7 |
isoduration | 20.11.0 |
jedi | 0.18.2 |
Jinja2 | 3.1.2 |
joblib | 1.2.0 |
jsonpatch | 1.32 |
jsonpointer | 2.3 |
jsonschema | 4.18.0 |
jsonschema-specifications | 2023.6.1 |
jupyter | 1.0.0 |
jupyter_client | 8.3.0 |
jupyter-console | 6.6.3 |
jupyter_core | 5.3.1 |
jupyter-events | 0.6.3 |
jupyter_server | 2.7.0 |
jupyter_server_terminals | 0.4.4 |
jupyterlab-pygments | 0.2.2 |
jupyterlab-widgets | 3.0.8 |
keyring | 18.0.1 |
kiwisolver | 1.4.4 |
labelImg | 1.8.6 |
language-selector | 0.1 |
launchpadlib | 1.10.13 |
lazr.restfulclient | 0.14.2 |
lazr.uri | 1.0.3 |
lightdm-gtk-greeter-settings | 1.2.2 |
linkify-it-py | 2.0.2 |
lit | 15.0.7 |
lockfile | 0.12.2 |
louis | 3.12.0 |
lxml | 4.6.2 |
macaroonbakery | 1.3.1 |
Mako | 1.1.0 |
Markdown | 3.4.1 |
markdown-it-py | 2.2.0 |
markdown2 | 2.4.9 |
MarkupSafe | 2.1.2 |
matplotlib | 3.6.3 |
matplotlib-inline | 0.1.6 |
mdit-py-plugins | 0.3.3 |
mdurl | 0.1.2 |
meld | 3.20.2 |
menulibre | 2.2.1 |
mistune | 3.0.1 |
mmcv | 1.7.1 |
mmdet | 2.28.1 |
monotonic | 1.5 |
mpmath | 1.2.1 |
mugshot | 0.4.2 |
multidict | 6.0.4 |
mxnet | 1.9.1 |
nbclassic | 1.0.0 |
nbclient | 0.8.0 |
nbconvert | 7.6.0 |
nbformat | 5.9.0 |
nest-asyncio | 1.5.6 |
netifaces | 0.10.4 |
networkx | 2.8.7 |
nh3 | 0.2.14 |
notebook | 6.5.4 |
notebook_shim | 0.2.3 |
numpy | 1.24.4 |
oauthlib | 3.1.0 |
olefile | 0.46 |
onboard | 1.4.1 |
onnx | 1.13.0 |
onnxruntime | 1.14.0 |
opencv-python | 4.6.0.66 |
orjson | 3.9.1 |
overrides | 7.3.1 |
packaging | 21.3 |
pandas | 1.5.3 |
pandocfilters | 1.5.0 |
paramiko | 2.6.0 |
parso | 0.8.3 |
pathtools | 0.1.2 |
peft | 0.3.0 |
pexpect | 4.6.0 |
pickleshare | 0.7.5 |
Pillow | 9.0.0 |
pip | 23.1.2 |
pkgutil_resolve_name | 1.3.10 |
platformdirs | 3.8.1 |
prometheus-client | 0.17.0 |
prompt-toolkit | 3.0.39 |
protobuf | 3.19.0 |
psutil | 5.9.4 |
ptyprocess | 0.7.0 |
pure-eval | 0.2.2 |
py-itree | 0.0.19 |
pyasn1 | 0.4.8 |
pyasn1-modules | 0.2.8 |
pycairo | 1.16.2 |
pycocotools | 2.0.6 |
pycodestyle | 2.10.0 |
pycparser | 2.21 |
pycups | 1.9.73 |
pydantic | 1.10.11 |
pydub | 0.25.1 |
Pygments | 2.15.1 |
PyGObject | 3.36.0 |
PyJWT | 1.7.1 |
pyllama | 0.0.8 |
pymacaroons | 0.13.0 |
PyNaCl | 1.3.0 |
pyparsing | 3.0.9 |
PyQt5 | 5.10.1 |
PyQt5-Qt5 | 5.15.2 |
PyQt5-sip | 12.12.1 |
pyRFC3339 | 1.1 |
pysvn | 1.9.9 |
python-apt | 2.0.1 |
python-dateutil | 2.8.2 |
python-debian | 0.1.36ubuntu1 |
python-json-logger | 2.0.7 |
python-multipart | 0.0.6 |
pytz | 2022.7.1 |
PyWavelets | 1.4.1 |
pyxdg | 0.26 |
PyYAML | 5.3.1 |
pyzmq | 25.1.0 |
qtconsole | 5.4.3 |
QtPy | 2.3.1 |
rabbitvcs | 0.18 |
referencing | 0.29.1 |
regex | 2023.6.3 |
reportlab | 3.5.34 |
requests | 2.31.0 |
requests-oauthlib | 1.3.1 |
requests-unixsocket | 0.2.0 |
rfc3339-validator | 0.1.4 |
rfc3986-validator | 0.1.1 |
rich | 13.4.2 |
rpds-py | 0.8.8 |
rsa | 4.9 |
safetensors | 0.3.1 |
scikit-image | 0.19.3 |
scikit-learn | 1.1.2 |
scipy | 1.9.3 |
seaborn | 0.12.2 |
SecretStorage | 2.3.1 |
semantic-version | 2.10.0 |
Send2Trash | 1.8.2 |
sentencepiece | 0.1.97 |
sentry-sdk | 1.15.0 |
setproctitle | 1.3.2 |
setuptools | 67.8.0 |
sgt-launcher | 0.2.5 |
shortuuid | 1.0.11 |
simplejson | 3.16.0 |
sip | 4.19.21 |
six | 1.14.0 |
sklearn | 0 |
smmap | 5.0.0 |
sniffio | 1.3.0 |
soupsieve | 2.4.1 |
ssh-import-id | 5.1 |
stack-data | 0.6.2 |
starlette | 0.27.0 |
svgwrite | 1.4.3 |
sympy | 1.11.1 |
systemd-python | 234 |
tensorboard | 2.12.0 |
tensorboard-data-server | 0.7.0 |
tensorboard-logger | 0.1.0 |
tensorboard-plugin-wit | 1.8.1 |
termcolor | 2.3.0 |
terminado | 0.17.1 |
terminaltables | 3.1.10 |
thop | 0.1.1.post2209072238 |
threadpoolctl | 3.1.0 |
tifffile | 2022.10.10 |
tiktoken | 0.4.0 |
tinycss2 | 1.2.1 |
tokenizers | 0.13.3 |
tomli | 2.0.1 |
toolz | 0.12.0 |
torch | 1.13.1+cu116 |
torchaudio | 0.13.1+cu116 |
torchscope | 0.1.0 |
torchvision | 0.14.1+cu116 |
tornado | 6.2 |
tqdm | 4.65.0 |
traitlets | 5.9.0 |
transformers | 4.28.1 |
triton | 2.0.0 |
typing_extensions | 4.7.1 |
ubuntu-advantage-tools | 8001 |
ubuntu-drivers-common | 0.0.0 |
uc-micro-py | 1.0.2 |
ufw | 0.36 |
ultralytics | 8.0.109 |
unattended-upgrades | 0.1 |
uri-template | 1.3.0 |
urllib3 | 1.26.14 |
usb-creator | 0.3.7 |
uvicorn | 0.22.0 |
visdom | 0.2.4 |
wadllib | 1.3.3 |
wandb | 0.13.10 |
wavedrom | 2.0.3.post3 |
wcwidth | 0.1.8 |
webcolors | 1.13 |
webencodings | 0.5.1 |
websocket-client | 1.5.1 |
websockets | 11.0.3 |
Werkzeug | 2.2.3 |
wheel | 0.34.2 |
widgetsnbextension | 4.0.8 |
xcffib | 0.8.1 |
xkit | 0.0.0 |
xxx | 0.0.1 |
yapf | 0.32.0 |
yarl | 1.9.2 |
zipp | 3.15.0 |
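To compare your own environment against this table, the standard library's importlib.metadata can report installed versions. The sketch below checks only a handful of rows that I picked as likely to matter for this setup; that selection, and the names installed_version, find_mismatches and PINS, are my own assumptions.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# A few rows copied from the table above; extend as needed.
PINS = {
    "torch": "1.13.1+cu116",
    "torchvision": "0.14.1+cu116",
    "transformers": "4.28.1",
    "fschat": "0.2.18",
}

if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: table has {wanted}, installed is {found}")
```

Run it inside the activated minigpt4 environment. A mismatch is not necessarily a problem; it only tells you where your setup differs from the one used in this guide.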