Implementing Convolution in Python

全栈程序员-用户IM • May 28, 2022, 6:20 AM • Uncategorized


Contents

Implementing convolution with tf.nn.conv2d()
Implementing the convolution function ourselves

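TensorFlow ships with a built-in convolution function, tf.nn.conv2d(). In this article we implement the convolution ourselves and then compare the result with that of tf.nn.conv2d() to verify that it is correct.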

Implementing convolution with tf.nn.conv2d()

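First, we call the built-in function to perform the convolution. The definition of conv2d and the meaning of its parameters are as follows.
[Definition]
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)
[Parameters]
input: the images to convolve, given as a tensor of shape [batch, in_height, in_width, in_channels], where batch is the number of images, in_height is the image height, in_width is the image width, and in_channels is the number of channels: 1 for a grayscale image, 3 for a color image. Other values are allowed as well, for example when the input is the multi-channel feature map produced by a previous convolutional layer.
filter: the convolution kernel, also a tensor, of shape [filter_height, filter_width, in_channels, out_channels], where filter_height is the kernel height, filter_width is the kernel width, in_channels is the number of image channels (it must match the in_channels of input), and out_channels is the number of kernels.
strides: the step of the sliding window along each dimension of input, a 1-D vector of the form [1, stride, stride, 1]. In the default NHWC layout the first and last entries (the batch and channel dimensions) must be 1.
padding: a string, either "SAME" or "VALID", that selects how the image border is handled. "SAME" pads the border with zeros where the kernel would otherwise run past the edge, so with a stride of 1 the output has the same height and width as the input. "VALID" adds no padding, so the kernel is applied only at positions where it fits entirely inside the image.
use_cudnn_on_gpu: a bool, whether to use cuDNN acceleration on the GPU; defaults to True.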
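
To make the effect of strides and padding concrete, the output size along one spatial dimension can be sketched as below. The helper name conv_output_size is made up for illustration and is not part of TensorFlow; the two formulas are the ones TensorFlow documents for "SAME" and "VALID" padding, stated here under the assumption of no dilation.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension (hypothetical helper, not a TensorFlow API)."""
    if padding == "SAME":
        # Zeros are added around the border, so only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 5-pixel side with a 3-pixel filter, as in the example in this article.
print(conv_output_size(5, 3, 1, "VALID"))  # 3
print(conv_output_size(5, 3, 1, "SAME"))   # 5
print(conv_output_size(5, 3, 2, "VALID"))  # 2
```

This is why a 5x5 input convolved with a 3x3 kernel at stride 1 under "VALID" padding gives a 3x3 output.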

import tensorflow as tf
import numpy as np

# Note: tf.placeholder and tf.Session are TensorFlow 1.x APIs.
input = np.array([[1,1,1,0,0],[0,1,1,1,0],[0,0,1,1,1],[0,0,1,1,0],[0,1,1,0,0]])
input = input.reshape([1,5,5,1])  # conv2d expects 4-D arguments: [batch, height, width, channels]
kernel = np.array([[1,0,1],[0,1,0],[1,0,1]])
kernel = kernel.reshape([3,3,1,1])  # the kernel must be 4-D too: [height, width, in_channels, out_channels]
print(input.shape, kernel.shape)  # (1, 5, 5, 1) (3, 3, 1, 1)

x = tf.placeholder(tf.float32, [1,5,5,1])
k = tf.placeholder(tf.float32, [3,3,1,1])
output = tf.nn.conv2d(x, k, strides=[1,1,1,1], padding='VALID')

with tf.Session() as sess:
    y = sess.run(output, feed_dict={x: input, k: kernel})
    print(y.shape)  # (1, 3, 3, 1)
    print(y)  # y is 4-D and the full printout is long; the 3x3 part (the middle two dimensions) is [[4,3,4],[2,4,3],[2,3,4]]
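
Since the output is 4-D but only its middle two dimensions are of interest, it may help to see how the [batch, height, width, channels] layout relates to a plain 2-D array. The following is a small NumPy-only sketch; the array names image, image_4d and plane are placeholders chosen for this example. The same indexing, y[0, :, :, 0], would extract the 3x3 part of the result above.

```python
import numpy as np

image = np.arange(25).reshape(5, 5)    # stand-in 5x5 single-channel image
image_4d = image.reshape(1, 5, 5, 1)   # [batch, height, width, channels]

# Selecting batch 0 and channel 0 recovers the original 2-D array.
plane = image_4d[0, :, :, 0]
print(plane.shape)                     # (5, 5)
print(np.array_equal(plane, image))    # True

# np.squeeze drops every axis of length 1 and gives the same result here.
print(np.array_equal(np.squeeze(image_4d), image))  # True
```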

Implementing the convolution function ourselves

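Next we implement the convolution ourselves. We ignore the batch and channel dimensions and work directly on the middle two dimensions. The code is below (it is a simplified version: stride and padding are not handled):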

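import numpy as np
input = np.array([[1,1,1,0,0],[0,1,1,1,0],[0,0,1,1,1],[0,0,1,1,0],[0,1,1,0,0]])
kernel = np.array([[1,0,1],[0,1,0],[1,0,1]])
print(input.shape, kernel.shape)  # (5, 5) (3, 3)

def my_conv(input, kernel):
    # Output size for stride 1 with no padding; assumes a square input and a square kernel
    output_size = len(input) - len(kernel) + 1
    res = np.zeros([output_size, output_size], np.float32)
    for i in range(len(res)):
        for j in range(len(res)):
            res[i][j] = compute_conv(input, kernel, i, j)
    return res

def compute_conv(input, kernel, i, j):
    # Sum of element-wise products between the kernel and the window whose top-left corner is (i, j)
    res = 0
    for kk in range(len(kernel)):        # use the kernel's own size instead of a hard-coded 3
        for k in range(len(kernel[0])):
            res += input[i+kk][j+k] * kernel[kk][k]  # key line: multiply corresponding elements and accumulate
    return res

print(my_conv(input, kernel))  # same values as the tf.nn.conv2d result: [[4,3,4],[2,4,3],[2,3,4]]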
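
The version above leaves out stride and padding. One possible way to add them is sketched below. This is not the author's code: the function name conv2d_naive is hypothetical, and the "SAME" branch assumes the convention that any odd leftover padding goes to the bottom and right. Like tf.nn.conv2d, it does not flip the kernel, so strictly speaking it computes a cross-correlation; the kernel used in this article is symmetric, so the distinction does not change the result here.

```python
import numpy as np

def conv2d_naive(image, kernel, stride=1, padding="VALID"):
    """Hypothetical helper: 2-D cross-correlation of a single-channel image."""
    image = np.asarray(image, dtype=np.float32)
    kernel = np.asarray(kernel, dtype=np.float32)
    in_h, in_w = image.shape
    k_h, k_w = kernel.shape

    if padding == "SAME":
        out_h = -(-in_h // stride)  # ceil(in_h / stride)
        out_w = -(-in_w // stride)
        pad_h = max((out_h - 1) * stride + k_h - in_h, 0)
        pad_w = max((out_w - 1) * stride + k_w - in_w, 0)
        # Assumed convention: an odd leftover row/column is padded at the bottom/right.
        image = np.pad(image,
                       ((pad_h // 2, pad_h - pad_h // 2),
                        (pad_w // 2, pad_w - pad_w // 2)),
                       mode="constant")
    elif padding == "VALID":
        out_h = (in_h - k_h) // stride + 1
        out_w = (in_w - k_w) // stride + 1
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = image[top:top + k_h, left:left + k_w]
            out[i, j] = np.sum(window * kernel)  # multiply element-wise, then sum
    return out

image = np.array([[1,1,1,0,0],[0,1,1,1,0],[0,0,1,1,1],[0,0,1,1,0],[0,1,1,0,0]])
kernel = np.array([[1,0,1],[0,1,0],[1,0,1]])
# The 3x3 result reported for tf.nn.conv2d earlier in this article.
expected = np.array([[4,3,4],[2,4,3],[2,3,4]], dtype=np.float32)

valid = conv2d_naive(image, kernel)
print(np.array_equal(valid, expected))  # True
same = conv2d_naive(image, kernel, padding="SAME")
print(same.shape)                       # (5, 5)
```

The comparison against expected is the check this article set out to make: the hand-written loop and tf.nn.conv2d agree on the 3x3 "VALID" result.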