torch.nn.Conv1d and One-Dimensional Convolution Explained [Easy to Understand]

全栈程序员-用户IM • April 13, 2022, 6:00 PM • Uncategorized


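I have been working with WaveNet recently and ran into one-dimensional convolution along the way. This post summarizes 1D convolution and its PyTorch API for future reference.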

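I was already fairly familiar with 2D convolution. When I first encountered 1D convolution, I assumed it meant a one-dimensional kernel sliding along a line, but that understanding is wrong. 1D convolution does not mean the kernel has only one dimension, nor that the feature being convolved is one-dimensional. "1D" means the kernel slides along a single direction.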

Let's start with a simple 1D convolution example (the batch size is 1 and there is only one kernel):

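Input:

A sequence of length 35 in which each element has a 256-dimensional feature vector, so the input can be viewed as (35, 256)
Kernel: size = (k,), with k = 2. Only the size along the convolution direction is specified; the kernel still spans all 256 feature dimensions.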
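The single-sample, single-kernel case above can be sketched by hand. This is only an illustration under stated assumptions: the values are random, the names `seq`, `kernel`, `manual`, and `reference` are made up for this sketch, and the bias term is omitted. It slides a (256, 2) kernel along the length axis and compares the result with `torch.nn.functional.conv1d`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

seq = torch.randn(35, 256)      # (length, features): one sample, as in the example
kernel = torch.randn(256, 2)    # one kernel: all 256 features x k=2 positions

# Slide the kernel along the length axis only: 35 - 2 + 1 = 34 positions.
manual = torch.stack([
    (seq[t:t + 2].T * kernel).sum()   # window (2, 256) -> (256, 2), then dot product
    for t in range(35 - 2 + 1)
])

# The same computation with PyTorch's functional API, which expects
# input (batch, channels, length) and weight (out_channels, in_channels, k).
reference = F.conv1d(seq.T.unsqueeze(0), kernel.unsqueeze(0)).squeeze()

print(manual.shape)                                   # torch.Size([34])
print(torch.allclose(manual, reference, atol=1e-3))   # compare the two results
```

Each output element mixes all 256 features from 2 neighboring positions, which is why the kernel is not one-dimensional even though it moves in only one direction.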

[Figure: illustration of the 1D convolution example above. Original image: https://ranchofromxgd.github.io/_posts/assets/2019-11-06-16-16-37.png]

The figure covers only a single sample. When the data is packed into a batch, it can be expressed in code as follows:

    import torch
    import torch.nn as nn

    conv1 = nn.Conv1d(in_channels=256, out_channels=100, kernel_size=2)
    input = torch.randn(32, 35, 256)
    # batch_size x text_len x embedding_size -> batch_size x embedding_size x text_len
    input = input.permute(0, 2, 1)
    # Conv1d expects (batch, channels, length); tensors can be passed in directly
    out = conv1(input)
    print(out.size())

Output:

torch.Size([32, 100, 34])
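The three entries of that size are (batch, out_channels, output length). As a quick sanity check, here is a minimal sketch, assuming the same layer configuration as the code above, that inspects the learnable parameters behind the layer and shows what happens when the `permute` step is skipped. The variable names `x` and `out` are chosen for this sketch only.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv1d(in_channels=256, out_channels=100, kernel_size=2)

# Each of the 100 kernels covers all 256 input channels and 2 positions.
print(conv1.weight.shape)   # torch.Size([100, 256, 2])
print(conv1.bias.shape)     # torch.Size([100])

x = torch.randn(32, 256, 35)            # (batch, channels, length)
out = conv1(x)
print(out.shape)                        # torch.Size([32, 100, 34])

# Without the permute, 35 is read as the channel count and the call fails.
try:
    conv1(torch.randn(32, 35, 256))
except RuntimeError as err:
    print("shape mismatch:", err)
```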

Before analyzing this result, let's look at the official documentation for nn.Conv1d:

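// Can be understood as the feature dimension
in_channels – Number of channels in the input image
// Number of output channels; can be understood as the number of kernels
out_channels – Number of channels produced by the convolution
// Kernel size; only the size along the convolution direction needs to be specified (because it is 1D)
kernel_size – Size of the convolving kernel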
stride – Stride of the convolution
padding – Zero-padding added to both sides of the input
dilation – Spacing between kernel elements
groups – Number of blocked connections from input channels to output channels
bias – If True, adds a learnable bias to the output
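The parameters listed above determine the output length along the convolution direction. The sketch below writes out the shape formula from the PyTorch `Conv1d` documentation, L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), in plain Python. The helper name `conv1d_output_length` is hypothetical and is not part of PyTorch.

```python
def conv1d_output_length(length_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along the convolution axis (hypothetical helper, not a PyTorch API)."""
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Padding is applied to both sides; integer floor division implements the floor.
    return (length_in + 2 * padding - effective_kernel) // stride + 1

print(conv1d_output_length(35, 2))               # 34, matches the example above
print(conv1d_output_length(35, 2, stride=2))     # 17
print(conv1d_output_length(35, 2, padding=1))    # 36
print(conv1d_output_length(35, 2, dilation=2))   # 33
```

`groups` and `bias` do not appear in the formula because they change how channels are connected and whether an offset is added, not how many positions the kernel visits.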