0.0
What is GigaPath
GigaPath is a whole-slide digital pathology foundation model developed by Microsoft: a deep learning architecture for extracting and processing information from high-resolution images such as pathology slide images.
The figure is divided into three parts: a, b, and c.
a
First, a high-resolution pathology image is fed in and split into 256×256 tiles, so that it can be processed tile by tile.
Each tile is passed through a Vision Transformer (ViT)-based encoder, which extracts tile-level features and produces an embedding for each tile.
The tile-level CLS token (classification token) is used to represent the global information of the whole tile.
Slide-Level Encoder (LongNet): these tile embeddings are then passed to a slide-level encoder based on LongNet (a long-sequence network), which uses Dilated Attention to capture long-range dependencies between tiles and produces an embedding for the entire whole-slide image.
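To make the data flow in part a concrete, here is a minimal, shape-level sketch in PyTorch. It is not the real GigaPath code: ToySlideEncoder is a hypothetical placeholder for the LongNet slide encoder (mean pooling plus a linear projection instead of dilated attention); only the tensor shapes (1536-dimensional tile embeddings, a 768-dimensional slide embedding, per-tile coordinates) reflect the actual model and the log output later in this walkthrough.

import torch
import torch.nn as nn

n_tiles, tile_dim, slide_dim = 810, 1536, 768        # 1536-d tile embeddings, 768-d slide embedding

tile_embeds = torch.randn(1, n_tiles, tile_dim)      # CLS-token embedding of each 256x256 tile
coords = torch.rand(1, n_tiles, 2)                   # (x, y) position of each tile on the slide

class ToySlideEncoder(nn.Module):
    # Placeholder for the LongNet slide-level encoder: the real model applies
    # dilated attention over the tile sequence; here we simply average and project.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, tiles, coords):
        return self.proj(tiles.mean(dim=1))          # aggregate all tiles into one slide vector

slide_embed = ToySlideEncoder(tile_dim, slide_dim)(tile_embeds, coords)
print(slide_embed.shape)                             # torch.Size([1, 768])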
b
Vision Transformer (Teacher Model): in the teacher model, the image tiles are grouped into several global crops, which are used to generate accurate embedding representations.
Vision Transformer (Student Model): the student model receives local crops and masked global crops. The student and teacher are aligned through a contrastive loss, so that the student produces similar features even from masked inputs.
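The sketch below illustrates the teacher-student alignment idea in part b. It is a simplified, hypothetical example using a DINO-style self-distillation loss with an EMA teacher, not GigaPath's actual pretraining code: the backbones are replaced by single linear layers and the crops by random feature vectors, and the exact loss used by the real model may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim, out_dim = 1536, 4096

student = nn.Linear(embed_dim, out_dim)              # stand-in for the student ViT + projection head
teacher = nn.Linear(embed_dim, out_dim)              # stand-in for the teacher ViT + projection head
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)                          # the teacher receives no gradients

def distillation_loss(student_out, teacher_out, temp_s=0.1, temp_t=0.04):
    # The student's output distribution is pulled toward the teacher's sharper one.
    t = F.softmax(teacher_out / temp_t, dim=-1).detach()
    s = F.log_softmax(student_out / temp_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

global_views = torch.randn(8, embed_dim)                              # features of global crops (teacher input)
masked_views = global_views + 0.1 * torch.randn_like(global_views)    # stand-in for masked / local crops

loss = distillation_loss(student(masked_views), teacher(global_views))
loss.backward()

# The teacher is updated as an exponential moving average of the student.
with torch.no_grad():
    momentum = 0.996
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1 - momentum)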
c
LongNet-based Decoder: this part shows how input embeddings are matched against target embeddings, with a reconstruction loss guiding the decoder to learn to generate the target embeddings.
Reconstruction Loss: the reconstruction loss between the generated embeddings and the target embeddings is computed and minimized to improve the quality of the model's outputs.
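A minimal sketch of the reconstruction objective in part c, again with stand-ins: a small TransformerEncoder plays the role of the LongNet-based decoder, masked tile embeddings are its input, and a mean-squared-error reconstruction loss compares its predictions with the target embeddings. The sizes and masking ratio below are illustrative only.

import torch
import torch.nn as nn
import torch.nn.functional as F

n_tiles, dim = 64, 768                               # toy sizes, not a real slide length

# Stand-in for the LongNet-based decoder.
decoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
    num_layers=2,
)

target_embeds = torch.randn(1, n_tiles, dim)         # target tile embeddings to be reconstructed
mask = torch.rand(1, n_tiles) < 0.5                  # randomly mask roughly half of the positions

inputs = target_embeds.clone()
inputs[mask] = 0.0                                   # masked positions carry no information

pred = decoder(inputs)                               # decoder predicts an embedding for every position
recon_loss = F.mse_loss(pred[mask], target_embeds[mask])   # reconstruction loss on masked positions
print(recon_loss.item())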
The final output is an embedding vector: high-dimensional, complex data (such as images, text, or audio) is compressed into a low-dimensional vector that can serve as the input feature for other tasks such as classification, clustering, or retrieval. Because these features are learned by the model, they are general-purpose and can be adapted to many downstream tasks.
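As a concrete example of this reuse, the sketch below trains a linear probe for a binary classification task on top of frozen slide embeddings. The embeddings and labels here are randomly generated toy data; only the 768-dimensional embedding size matches the slide encoder.

import torch
import torch.nn as nn

slide_embeds = torch.randn(32, 768)                  # 32 slides, one 768-d embedding each (toy data)
labels = torch.randint(0, 2, (32,))                  # toy binary labels, e.g. tumor vs. normal

probe = nn.Linear(768, 2)                            # only this small head is trained; the backbone stays frozen
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(probe(slide_embeds), labels)
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.4f}")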
0.1
Model deployment
First, copy the files from /home/data/hf/Gigapath to your local working directory.
Open a terminal and run the following command:
cp -a /home/data/hf/Gigapath .
Then, following the tutorial on the Prov-GigaPath/Prov-GigaPath page on Hugging Face, use the environment.yaml provided there to create our gigapath virtual environment:
cd Gigapath/prov-gigapath_github
conda env create -f environment.yaml
conda activate gigapath
pip install -e .
Next, add CUDA to the environment variables (for example by exporting CUDA_HOME and appending the CUDA library directory to LD_LIBRARY_PATH; the exact paths depend on where CUDA is installed on your system).
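Before writing the inference script, an optional quick sanity check can be run inside the activated environment; this snippet only confirms that torch, timm, gigapath, and the GPU are visible, and is not part of the deployment itself.

import torch
import timm
import gigapath

print("torch:", torch.__version__)
print("timm:", timm.__version__)
print("CUDA available:", torch.cuda.is_available())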
Then create the following inference script and save it as test_gigapath.py (the name used by the PBS job script below):

import os
import torch
import timm
import numpy as np
import gigapath.slide_encoder as slide_encoder
from gigapath.pipeline import run_inference_with_tile_encoder, run_inference_with_slide_encoder

print("................")

# Directory containing the extracted input tile images
slide_dir = "/public/liujx/Gigapath/PANDA/PANDA_sample_tiles/054b6888604d963455bfff551518ece5"
image_paths = [os.path.join(slide_dir, img) for img in os.listdir(slide_dir) if img.endswith('.png')]
print(f"Found {len(image_paths)} image tiles")

# Load the tile_encoder model
model_arch = "vit_giant_patch14_dinov2"
model_path = "/home/data/hf/Gigapath/prov-gigapath_hf/pytorch_model.bin"
tile_encoder = timm.create_model(
    model_arch,
    pretrained=True,
    img_size=224,
    in_chans=3,
    pretrained_cfg_overlay=dict(file=model_path),
)

# Print the number of parameters
print("tile_encoder param #", sum(p.numel() for p in tile_encoder.parameters()))

# Load the slide_encoder model
slide_encoder_model = slide_encoder.create_model(
    pretrained="/public/liujx/Gigapath/prov-gigapath_hf/slide_encoder.pth",
    model_arch="gigapath_slide_enc12l768d",
    in_chans=1536,
)
print("slide_encoder param #", sum(p.numel() for p in slide_encoder_model.parameters()))

# Run tile_encoder inference
tile_encoder_outputs = run_inference_with_tile_encoder(image_paths, tile_encoder)

# Print the shapes of the tile_encoder outputs
for k in tile_encoder_outputs.keys():
    print(f"tile_encoder_outs[{k}].shape: {tile_encoder_outputs[k].shape}")

# Run slide_encoder inference
slide_embeds = run_inference_with_slide_encoder(slide_encoder_model=slide_encoder_model, **tile_encoder_outputs)
print(slide_embeds.keys())

# Save the slide_embeds output as a PyTorch .pt file
save_dir = "/public/liujx/Gigapath/prov-gigapath_github"
os.makedirs(save_dir, exist_ok=True)

slide_embeds_path = os.path.join(save_dir, "slide_embeds.pt")
torch.save(slide_embeds, slide_embeds_path)
print(f"slide_embeds saved to {slide_embeds_path}")
Then submit the job on a GPU node with the following PBS script:

#!/bin/bash
#PBS -N test
#PBS -o test_$PBS_JOBID.log
#PBS -e test_$PBS_JOBID.err
#PBS -l nodes=1:ppn=12
#PBS -q gpu
cd $PBS_O_WORKDIR
module add gcc/11.2.0
source /home/data/software/python/3.12.7/gigapath/bin/activate
echo "test"
python3 test_gigapath.py
Next, because the original project code has a bug, we need to make a small change.
vim /public/liujx/Gigapath/prov-gigapath_github/gigapath/torchscale/model/../../torchscale/architecture/config.py
Press i to enter insert mode and add the following line at the top of the file:
import numpy as np
Then press Esc and type :wq to save and exit.
After saving, return to your project directory (cd into it) and submit the PBS job, e.g. with qsub. The resulting .err file will contain output like the following (deprecation warnings from timm and torch plus the tile-encoder progress bar; they do not prevent the run from completing):
/home/data/software/python/3.12.7/gigapath/lib/python3.12/site-packages/timm/models/registry.py:4: FutureWarning: Importing from timm.models.registry is deprecated, please import via timm.models
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.models", FutureWarning)
/home/data/software/python/3.12.7/gigapath/lib/python3.12/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
/public/liujx/Gigapath/prov-gigapath_github/gigapath/slide_encoder.py:236: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  state_dict = torch.load(local_path, map_location="cpu")["model"]
/public/liujx/Gigapath/prov-gigapath_github/gigapath/pipeline.py:102: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(dtype=torch.float16):
Running inference with tile encoder:   0%|          | 0/7 [00:00<?, ?it/s]
Running inference with tile encoder:  14%|█▍        | 1/7 [00:17<01:47, 17.90s/it]
Running inference with tile encoder:  29%|██▊       | 2/7 [00:35<01:29, 17.83s/it]
Running inference with tile encoder:  43%|████▎     | 3/7 [00:56<01:16, 19.09s/it]
Running inference with tile encoder:  57%|█████▋    | 4/7 [01:16<00:58, 19.58s/it]
Running inference with tile encoder:  71%|███████▏  | 5/7 [01:33<00:37, 18.56s/it]
Running inference with tile encoder:  86%|████████▌ | 6/7 [01:47<00:17, 17.14s/it]
Running inference with tile encoder: 100%|██████████| 7/7 [01:54<00:00, 13.70s/it]
Running inference with tile encoder: 100%|██████████| 7/7 [01:54<00:00, 16.34s/it]
/public/liujx/Gigapath/prov-gigapath_github/gigapath/pipeline.py:130: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(dtype=torch.float16):
The corresponding .log file looks like this:
test
................
Found 810 image tiles
tile_encoder param # 1134769664
dilated_ratio:  [1, 2, 4, 8, 16]
segment_length:  [np.int64(1024), np.int64(5792), np.int64(32768), np.int64(185363), np.int64(1048576)]
Number of trainable LongNet parameters:  85148160
Global Pooling: False
Successfully Loaded Pretrained GigaPath model from /public/liujx/Gigapath/prov-gigapath_hf/slide_encoder.pth
slide_encoder param # 86330880
tile_encoder_outs[tile_embeds].shape: torch.Size([810, 1536])
tile_encoder_outs[coords].shape: torch.Size([810, 2])
dict_keys(['layer_0_embed', 'layer_1_embed', 'layer_2_embed', 'layer_3_embed', 'layer_4_embed', 'layer_5_embed', 'layer_6_embed', 'layer_7_embed', 'layer_8_embed', 'layer_9_embed', 'layer_10_embed', 'layer_11_embed', 'layer_12_embed', 'last_layer_embed'])
slide_embeds saved to /public/liujx/Gigapath/prov-gigapath_github/slide_embeds.pt
This shows that the pretrained models were loaded correctly and that the pipeline ran successfully.
At this point, a new file named slide_embeds.pt will appear in your working directory. This is a PyTorch tensor file, and we can open and inspect it with the following code:
import torch
import pandas as pd

# Load the .pt file
slide_embeds_path = "/public/liujx/Gigapath/prov-gigapath_github/slide_embeds.pt"
slide_embeds = torch.load(slide_embeds_path)

# Collect the results in a dictionary for the DataFrame
data_dict = {}

# Flatten each layer's embedding to a 1-D array and add it to the dictionary
for key, tensor in slide_embeds.items():
    # If the tensor is 2-D, flatten it to 1-D
    flattened_tensor = tensor.cpu().numpy().flatten()
    data_dict[key] = flattened_tensor

# Convert the dictionary into a DataFrame
df = pd.DataFrame(data_dict)

# Save the DataFrame as a CSV file
csv_file_path = "/public/liujx/Gigapath/prov-gigapath_github/slide_embeds.csv"
df.to_csv(csv_file_path, index=False)

# Print the CSV file path
print(f"CSV file saved at {csv_file_path}")
Save this script in the directory containing the .pt file and run it; it converts the .pt file into a CSV file, which is easier to inspect and can be opened directly in Excel.
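As an optional check, you can read the CSV back with pandas and confirm its contents; the path below follows the conversion script above.

import pandas as pd

df = pd.read_csv("/public/liujx/Gigapath/prov-gigapath_github/slide_embeds.csv")
print(df.shape)              # rows = flattened embedding entries, columns = embedding keys
print(df.columns.tolist())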
Next, we move on to fine-tuning.