I am trying to design a simple LSTM in TensorFlow. I want to classify a sequence of data into classes from 1 to 10.
I have 10 timestamps and data X. I am only taking one sequence for now, so my batch size = 1.
At every epoch, a new sequence is generated. For example, X is a numpy array like this:

    X [[ 2.52413028 2.49449348 2.46520466 2.43625973 2.40765466 2.37938545 2.35144815 2.32383888 2.29655379 2.26958905]]
To make it suitable for LSTM input, I first converted it to a tensor and then reshaped it to (batch_size, sequence_length, input_dimension):
    X = np.array([amplitude * np.exp(-t / tau)])
    print 'X', X

    # Sorting out the input
    train_input = X
    train_input = tf.convert_to_tensor(train_input)
    train_input = tf.reshape(train_input, [1, 10, 1])
    print 'ti', train_input
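The snippet above depends on a few names (amplitude, t, tau, resolution) that are not shown in the post; a minimal, assumed setup consistent with the rest of the question could look like this:

    import math
    import numpy as np
    import tensorflow as tf

    # Assumed setup (not shown in the question): 10 timestamps at a fixed
    # resolution, unit amplitude, and a decay constant tau drawn from (0, 1].
    resolution = 0.1
    t = resolution * np.arange(1, 11)          # 10 timestamps: 0.1, 0.2, ..., 1.0
    amplitude = 1.0
    tau = np.random.uniform(resolution, 1.0)   # resampled for every new sequence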
For the output, I am generating a one-hot encoded label within a class range of 1 to 10.
    # ------------ sorting out the output
    train_output = [int(math.ceil(tau / resolution))]
    train_output = one_hot(train_output, num_labels=10)
    print 'label', train_output
    train_output = tf.convert_to_tensor(train_output)

    >> label [[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
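The one_hot helper is not defined in the post; a minimal numpy version that is consistent with the printed label above (assuming 1-based class indices) might look like this:

    # Assumed helper (not from the original post): maps 1-based integer class
    # labels to one-hot rows of length num_labels.
    def one_hot(labels, num_labels):
        encoded = np.zeros((len(labels), num_labels))
        for i, label in enumerate(labels):
            encoded[i, label - 1] = 1.0
        return encoded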
Then I created the placeholders for the TensorFlow graph, made the LSTM cell, and gave it weights and bias:
    data = tf.placeholder(tf.float32, shape=[batch_size, len(t), 1])
    target = tf.placeholder(tf.float32, shape=[batch_size, num_classes])

    cell = tf.nn.rnn_cell.LSTMCell(num_hidden)
    output, state = rnn.dynamic_rnn(cell, data, dtype=tf.float32)

    weight = tf.Variable(tf.random_normal([batch_size, num_classes, 1]))
    bias = tf.Variable(tf.random_normal([num_classes]))

    # training
    prediction = tf.nn.softmax(tf.matmul(output, weight) + bias)
    cross_entropy = -tf.reduce_sum(target * tf.log(prediction))
    optimizer = tf.train.AdamOptimizer()
    minimize = optimizer.minimize(cross_entropy)
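The hyperparameters are also not defined in the post; judging from the question text and the shapes in the traceback below, they would be something like:

    # Assumed hyperparameters: batch_size and num_classes are stated in the
    # question; num_hidden = 5 is inferred from the (1, 10, 5) shape in the
    # traceback below.
    batch_size = 1
    num_classes = 10
    num_hidden = 5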
I have written the code this far and got an error at the training step. Is it to do with the input shapes? Here is the traceback:
Traceback (most recent call last):
File "/home/raisa/PycharmProjects/RNN_test1/test3.py", line 66, in <module> prediction = tf.nn.softmax(tf.matmul(output,weight) + bias) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1036, in matmul name=name) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 911, in _mat_mul transpose_b=transpose_b, name=name) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 655, in apply_op op_def=op_def) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2156, in create_op set_shapes_for_outputs(ret) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1612, in set_shapes_for_outputs shapes = shape_func(op) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/common_shapes.py", line 81, in matmul_shape a_shape = op.inputs.get_shape().with_rank(2) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 625, in with_rank raise ValueError("Shape %s must have rank %d" % (self, rank)) ValueError: Shape (1, 10, 5) must have rank 2
If you are using TF >= 1.0, you can take advantage of the tf.contrib.rnn library and the OutputProjectionWrapper to add a fully connected layer to the output of your RNN. Something like:
    # Network definition.
    cell = tf.contrib.rnn.LSTMCell(num_hidden)
    cell = tf.contrib.rnn.OutputProjectionWrapper(cell, num_classes)  # adds an output FC layer for you
    output, state = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32)

    # Training.
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=target)
    cross_entropy = tf.reduce_sum(cross_entropy)
    optimizer = tf.train.AdamOptimizer()
    minimize = optimizer.minimize(cross_entropy)
Note I’m using softmax_cross_entropy_with_logits instead of using your prediction op and calculating cross entropy manually; it is supposed to be more efficient and robust. OutputProjectionWrapper basically does the same thing as adding a fully connected output layer yourself, but it might help alleviate some headaches.
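For completeness, running the training step for this graph would look roughly like the sketch below. This is not part of the answer: the placeholder names match the question, num_epochs is assumed, and the inputs have to be fed as plain numpy arrays (so the tf.convert_to_tensor calls from the question can be dropped).

    # Rough training-loop sketch: the [1, 10] sequence X is reshaped to the
    # placeholder's [1, 10, 1] shape, and y is the one-hot numpy label of
    # shape [1, 10]; both are regenerated every epoch.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(num_epochs):
            sess.run(minimize, feed_dict={data: X.reshape(1, 10, 1), target: y})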
Looking at your code, your rnn output should have a dimension of batch_size x 1 x num_hidden, while your weight has dimension batch_size x num_classes x 1; however, you want the multiplication of those two to be batch_size x num_classes.
Can you try output = tf.reshape(output, [batch_size, num_hidden]) and weight = tf.Variable(tf.random_normal([num_hidden, num_classes])) and let me know how that goes?
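Putting that suggestion together, the classification head could look like the sketch below. This is an interpretation rather than the original poster's code: with a 10-step sequence the RNN output is [batch_size, 10, num_hidden], so the sketch keeps only the last timestep before the matmul (assuming TF >= 1.0 slicing) instead of reshaping the full output.

    # Sketch of the suggested shape fix: take the last timestep so both matmul
    # operands are rank 2.
    last_output = output[:, -1, :]                      # [batch_size, num_hidden]
    weight = tf.Variable(tf.random_normal([num_hidden, num_classes]))
    bias = tf.Variable(tf.random_normal([num_classes]))
    logits = tf.matmul(last_output, weight) + bias      # [batch_size, num_classes]
    prediction = tf.nn.softmax(logits)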