Goal of this article: display the visual patterns that convnet filters respond to.


In this example, we look into what kinds of visual patterns image classification models learn. We use the ResNet50V2 model, trained on the ImageNet dataset.


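Our process is simple: we create input images that maximize the activation of specific filters in a target layer (picked somewhere in the middle of the model: layer conv3_block4_out). Such images visualize the pattern that each filter responds to.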


import os

os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import numpy as np
import tensorflow as tf

# The dimensions of our input image
img_width = 180
img_height = 180
# Our target layer: we will visualize the filters from this layer.
# See `model.summary()` for list of layer names, if you want to change this.
layer_name = "conv3_block4_out"


# Build a ResNet50V2 model loaded with pre-trained ImageNet weights
model = keras.applications.ResNet50V2(weights="imagenet", include_top=False)

# Set up a model that returns the activation values for our target layer
layer = model.get_layer(name=layer_name)
feature_extractor = keras.Model(inputs=model.inputs, outputs=layer.output)


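The "loss" we will maximize is simply the mean of the activation of a specific filter in our target layer. To avoid border effects, we exclude border pixels.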

def compute_loss(input_image, filter_index):
    activation = feature_extractor(input_image)
    # We avoid border artifacts by only involving non-border pixels in the loss.
    filter_activation = activation[:, 2:-2, 2:-2, filter_index]
    return tf.reduce_mean(filter_activation)
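To see why the border is excluded, here is a minimal NumPy sketch on a made-up activation tensor. The tensor, its 6x6 size, and the helper name `border_cropped_mean` are illustrative assumptions, not part of the model above; only the slicing pattern `[:, 2:-2, 2:-2, filter_index]` mirrors `compute_loss`.

```python
import numpy as np

# Hypothetical toy activation: batch of 1, 6x6 spatial grid, 4 filters.
activation = np.zeros((1, 6, 6, 4), dtype="float32")
activation[0, :, :, 1] = 1.0    # filter 1 responds uniformly...
activation[0, 0, :, 1] = 100.0  # ...except for a large spurious response on the top border


def border_cropped_mean(activation, filter_index, border=2):
    """Mean activation of one filter, ignoring `border` pixels on each side."""
    inner = activation[:, border:-border, border:-border, filter_index]
    return float(inner.mean())


full_mean = float(activation[..., 1].mean())       # dominated by the border row
cropped_mean = border_cropped_mean(activation, 1)  # only sees the interior
print(full_mean, cropped_mean)
```

In this toy case the uncropped mean is pulled far away from the interior value by a single border row, while the cropped mean is not affected by it.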


def gradient_ascent_step(img, filter_index, learning_rate):
    with tf.GradientTape() as tape:
        tape.watch(img)
        loss = compute_loss(img, filter_index)
    # Compute gradients.
    grads = tape.gradient(loss, img)
    # Normalize gradients.
    grads = tf.math.l2_normalize(grads)
    img += learning_rate * grads
    return loss, img
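The effect of normalizing the gradient can be illustrated on a toy problem with an analytic gradient. The sketch below is NumPy only and does not touch the model: `toy_loss`, `toy_grad`, and `target` are hypothetical stand-ins, and `l2_normalize` restates the formula documented for `tf.math.l2_normalize` (with its default epsilon assumed to be 1e-12).

```python
import numpy as np


def l2_normalize(v, epsilon=1e-12):
    # Same formula as documented for tf.math.l2_normalize:
    # v / sqrt(max(sum(v**2), epsilon))
    return v / np.sqrt(max(float(np.sum(v**2)), epsilon))


target = np.array([3.0, 4.0])  # hypothetical maximizer of the toy loss


def toy_loss(x):
    # Concave loss with its maximum at `target`.
    return -float(np.sum((x - target) ** 2))


def toy_grad(x):
    # Analytic gradient of toy_loss.
    return -2.0 * (x - target)


def ascent_step(x, learning_rate):
    # Move *along* the normalized gradient: ascent, not descent.
    return x + learning_rate * l2_normalize(toy_grad(x))


x0 = np.zeros(2)
x1 = ascent_step(x0, learning_rate=0.5)
step_length = float(np.linalg.norm(x1 - x0))
print(step_length, toy_loss(x0), toy_loss(x1))
```

Because the gradient is rescaled to unit L2 norm, the length of each update equals the learning rate regardless of how large or small the raw gradient is, which keeps the step size predictable across filters.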



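The end-to-end visualization loop works as follows:

* Start from a random image that is close to "all gray" (i.e. visually neutral)
* Repeatedly apply the gradient ascent step function defined above
* Convert the resulting input image back to a displayable form by normalizing it, center-cropping it, and restricting it to the [0, 255] range.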

def initialize_image():
    # We start from a gray image with some random noise
    img = tf.random.uniform((1, img_width, img_height, 3))
    # ResNet50V2 expects inputs in the range [-1, +1].
    # Here we scale our random inputs to [-0.125, +0.125]
    return (img - 0.5) * 0.25


def visualize_filter(filter_index):
    # We run gradient ascent for 30 steps
    iterations = 30
    learning_rate = 10.0
    img = initialize_image()
    for iteration in range(iterations):
        loss, img = gradient_ascent_step(img, filter_index, learning_rate)

    # Decode the resulting input image
    img = deprocess_image(img[0].numpy())
    return loss, img


def deprocess_image(img):
    # Normalize array: center on 0., ensure variance is 0.15
    img -= img.mean()
    img /= img.std() + 1e-5
    img *= 0.15

    # Center crop
    img = img[25:-25, 25:-25, :]

    # Clip to [0, 1]
    img += 0.5
    img = np.clip(img, 0, 1)

    # Convert to RGB array
    img *= 255
    img = np.clip(img, 0, 255).astype("uint8")
    return img
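As a quick sanity check of the deprocessing steps, the following NumPy-only sketch restates them in a non-mutating helper and runs it on random data. The name `deprocess_copy` and the random input are assumptions made for illustration; the real input would be the optimized image returned by gradient ascent.

```python
import numpy as np


def deprocess_copy(img, crop=25):
    """Non-mutating restatement of the deprocessing steps, for checking shapes."""
    img = img.astype("float64")  # astype returns a copy, so the caller's array is untouched
    # Center on 0 and rescale to a standard deviation of about 0.15.
    img = (img - img.mean()) / (img.std() + 1e-5) * 0.15
    # Center crop: drop `crop` pixels on every side.
    img = img[crop:-crop, crop:-crop, :]
    # Shift to mid-gray and clip to [0, 1], then convert to 8-bit RGB.
    img = np.clip(img + 0.5, 0, 1)
    return (img * 255).astype("uint8")


rng = np.random.default_rng(0)
raw = rng.normal(size=(180, 180, 3)) * 40.0  # hypothetical stand-in for an optimized image
raw_before = raw.copy()
out = deprocess_copy(raw)
print(out.shape, out.dtype)
```

A 180x180 input comes out as a 130x130 `uint8` image, which is the tile size to plan for when several results are stitched into a grid.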

Let's try it out with filter 0 in the target layer:

from IPython.display import Image, display

loss, img = visualize_filter(0)
keras.utils.save_img("0.png", img)

This is what an input that maximizes the response of filter 0 in the target layer looks like:


Visualize the first 64 filters in the target layer

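Now, let's make an 8x8 grid of the first 64 filters in the target layer to get a feel for the range of different visual patterns that the model has learned.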

# Compute image inputs that maximize per-filter activations
# for the first 64 filters of our target layer
all_imgs = []
for filter_index in range(64):
    print("Processing filter %d" % (filter_index,))
    loss, img = visualize_filter(filter_index)
    all_imgs.append(img)

# Build a black picture with enough space for
# our 8 x 8 filters of size 130 x 130, with a 5px margin in between
margin = 5
n = 8
cropped_width = img_width - 25 * 2
cropped_height = img_height - 25 * 2
width = n * cropped_width + (n - 1) * margin
height = n * cropped_height + (n - 1) * margin
stitched_filters = np.zeros((width, height, 3))

# Fill the picture with our saved filters
for i in range(n):
    for j in range(n):
        img = all_imgs[i * n + j]
        stitched_filters[
            (cropped_width + margin) * i : (cropped_width + margin) * i + cropped_width,
            (cropped_height + margin) * j : (cropped_height + margin) * j
            + cropped_height,
            :,
        ] = img
keras.utils.save_img("stiched_filters.png", stitched_filters)

from IPython.display import Image, display

display(Image("stiched_filters.png"))
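The grid arithmetic is easy to get wrong by a few pixels, so here is a small pure-Python sketch of the layout under the same numbers (8 tiles per side, 180 px images cropped by 25 px per side, 5 px margin). The helper `tile_offsets` is a hypothetical name introduced only for this check.

```python
def tile_offsets(n, tile, margin):
    """Start offset of each of n tiles laid out with `margin` pixels between them."""
    return [(tile + margin) * i for i in range(n)]


n, margin = 8, 5
tile = 180 - 25 * 2                   # 130: a 180 px image after a 25 px crop per side
canvas = n * tile + (n - 1) * margin  # n tiles plus the (n - 1) gaps between them
offsets = tile_offsets(n, tile, margin)
print(tile, canvas, offsets[-1] + tile)
```

The last tile ends exactly at the canvas edge, which confirms that margins are placed only between tiles and not around the outside of the grid.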


Image classification models see the world by decomposing their inputs over a "vector basis" of texture filters such as these.





