I am trying to build a pkl file from a CSV starting point and load it into Theano.
import numpy as np
import csv
import gzip, cPickle
from numpy import genfromtxt
import theano
import theano.tensor as T
#Open csv file and read in data
csvFile = "filename.csv"
my_data = genfromtxt(csvFile, delimiter=',', skip_header=1)
data_shape = "There are " + repr(my_data.shape[0]) + " samples of vector length " + repr(my_data.shape[1])
num_rows = my_data.shape[0] # Number of data samples
num_cols = my_data.shape[1] # Length of Data Vector
total_size = (num_cols-1) * num_rows
data = np.arange(total_size)
data = data.reshape(num_rows, num_cols-1) # 2D Matrix of data points
data = data.astype('float32')
label = np.arange(num_rows)
print label.shape
#label = label.reshape(num_rows, 1) # 2D Matrix of data points
label = label.astype('float32')
print data.shape
#Read through data file, assume label is in last col
for i in range(my_data.shape[0]):
    label[i] = my_data[i][num_cols-1]
    for j in range(num_cols-1):
        data[i][j] = my_data[i][j]
#Split data in terms of 70% train, 10% val, 20% test
train_num = int(num_rows * 0.7)
val_num = int(num_rows * 0.1)
test_num = int(num_rows * 0.2)
DataSetState = "This dataset has " + repr(data.shape[0]) + " samples of length " + repr(data.shape[1]) + ". The number of training examples is " + repr(train_num)
print DataSetState
train_set_x = data[:train_num]
train_set_y = label[:train_num]
val_set_x = data[train_num+1:train_num+val_num]
val_set_y = label[train_num+1:train_num+val_num]
test_set_x = data[train_num+val_num+1:]
test_set_y = label[train_num+val_num+1:]
# Divided dataset into 3 parts. split by percentage.
train_set = train_set_x, train_set_y
val_set = val_set_x, val_set_y
test_set = test_set_x, val_set_y
dataset = [train_set, val_set, test_set]
f = gzip.open(csvFile+'.pkl.gz','wb')
cPickle.dump(dataset, f, protocol=2)
f.close()
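A quick check that could be run on the archive at this point is to reload it and compare the number of samples and labels in each split. This is only a sketch and not part of the original script: the helper name `check_pairs` and the tiny demo archive are made up for illustration, and it assumes Python 3, where `pickle` replaces `cPickle` and `print` is a function.

```python
import gzip
import os
import pickle
import tempfile


def check_pairs(path):
    """Reload a [train, val, test] archive and return (n_samples, n_labels) per split."""
    with gzip.open(path, "rb") as f:
        dataset = pickle.load(f)
    return [(len(x), len(y)) for x, y in dataset]


# Tiny made-up archive, written the same way as in the listing above.
# The values are for demonstration only.
demo = [
    ([[0.0], [1.0]], [0, 1]),   # train: 2 samples, 2 labels
    ([[2.0]], [1]),             # val:   1 sample,  1 label
    ([[3.0], [4.0]], [0, 0]),   # test:  2 samples, 2 labels
]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pkl.gz")
with gzip.open(demo_path, "wb") as f:
    pickle.dump(demo, f, protocol=2)

# Report each split; a MISMATCH means x and y cannot be batched together.
for name, (n_x, n_y) in zip(("train", "val", "test"), check_pairs(demo_path)):
    print(name, n_x, n_y, "OK" if n_x == n_y else "MISMATCH")
```

If any split reports different counts for samples and labels, slicing both by the same minibatch index will eventually yield vectors of different lengths.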
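When I run the resulting pkl file through Theano (as a DBN or SdA), pretraining runs fine, which made me think the data was stored correctly.
However, when it gets to fine-tuning, I get the following error: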
epoch 1, minibatch 2775/2775, validation error 0.000000 %
Traceback (most recent call last):
  File "SdA_custom.py", line 489, in <module>
    test_SdA()
  File "SdA_custom.py", line 463, in test_SdA
    test_losses = test_model()
  File "SdA_custom.py", line 321, in test_score
    return [test_score_i(i) for i in xrange(n_test_batches)]
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 606, in __call__
    storage_map=self.fn.storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 595, in __call__
    outputs = self.fn()
ValueError: Input dimension mis-match. (input[0].shape[0] = 10, input[1].shape[0] = 3)
Apply node that caused the error: Elemwise{neq,no_inplace}(argmax, Subtensor{int64:int64:}.0)
Inputs types: [TensorType(int64, vector), TensorType(int32, vector)]
Inputs shapes: [(10,), (3,)]
Inputs strides: [(8,), (4,)]
Inputs values: ['not shown', array([0, 0, 0], dtype=int32)]
Backtrace when the node is created:
  File "/home/dean/Documents/DeepLearningRepo/DeepLearningTutorials-master/code/logistic_sgd.py", line 164, in errors
    return T.mean(T.neq(self.y_pred, y))
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
10 is my batch size. If I change the batch size to 1, I get the following instead:
ValueError: Input dimension mis-match. (input[0].shape[0] = 1, input[1].shape[0] = 0)
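I think I stored the labels incorrectly when building the pkl, but I can't spot what is going wrong or why changing the batch size changes the error.
Hope you can help!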
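One reading that fits both error messages: the listing builds `test_set = test_set_x, val_set_y`, so the test samples are paired with the validation labels, which form a much shorter vector. Pretraining is unsupervised and only reads `train_set_x`, so it cannot reveal a label problem. During testing, the minibatch slice of the labels runs past the end of the short vector and comes back truncated (3 items against 10) or empty (0 against 1), which would explain why the batch size changes the numbers in the message. The `+1` offsets in the slices also drop one sample at each boundary. The sketch below is an assumption-laden illustration, not the tutorial's code: `split_dataset` and `first_mismatch` are hypothetical helpers, the sizes 40 and 23 are made up, and it is written for Python 3.

```python
def split_dataset(data, label, train_frac=0.7, val_frac=0.1):
    """Split into contiguous train/val/test parts; whatever remains goes to test."""
    if len(data) != len(label):
        raise ValueError("data and label must have the same length")
    train_end = int(len(data) * train_frac)
    val_end = train_end + int(len(data) * val_frac)
    # Each slice starts exactly where the previous one ended, so no sample is skipped,
    # and every x slice is paired with the label slice over the same range.
    train_set = (data[:train_end], label[:train_end])
    val_set = (data[train_end:val_end], label[train_end:val_end])
    test_set = (data[val_end:], label[val_end:])
    return [train_set, val_set, test_set]


def first_mismatch(n_x, n_y, batch_size):
    """Return (x_len, y_len) for the first minibatch whose slices differ in length.

    Mimics slicing x and y with [i * batch_size:(i + 1) * batch_size], where a
    slice past the end of a sequence is silently truncated. Returns None if
    every minibatch matches.
    """
    for i in range(n_x // batch_size):
        start, stop = i * batch_size, (i + 1) * batch_size
        x_len = min(stop, n_x) - min(start, n_x)
        y_len = min(stop, n_y) - min(start, n_y)
        if x_len != y_len:
            return (x_len, y_len)
    return None


# Made-up sizes: 40 test samples paired with only 23 labels.
print(first_mismatch(40, 23, 10))  # (10, 3)
print(first_mismatch(40, 23, 1))   # (1, 0)

# With matching lengths, every split pairs x and y over the same range.
parts = split_dataset(list(range(100)), list(range(100)))
print([(len(x), len(y)) for x, y in parts])
```

The same slicing works unchanged on NumPy arrays, so under this reading the original script would only need the test labels paired with the test samples and the `+1` offsets removed.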
Published by: 全栈程序员-用户IM. Please credit the source when reposting: https://javaforall.cn/195494.html Original link: https://javaforall.cn