
LSTM: Many to many sequence prediction with different sequence length #6063

Closed
Ironbell opened this issue Mar 30, 2017 · 17 comments

@Ironbell

First of all, I know that there are already issues open regarding that topic, but their solutions don't solve my problem and I'll explain why.

The problem is to predict the next n_post steps of a sequence given n_pre steps of it, with n_pre < n_post. I've built a toy example using a simple sine wave to illustrate it. The many to one forecast (n_pre=50, n_post=1) works perfectly:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

# hidden_neurons = number of LSTM units, defined earlier in the script
model = Sequential()
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))
model.add(Dense(1))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])

[plot: many-to-one forecast (n_pre=50, n_post=1)]

Also, the many to many forecast with (n_pre=50, n_post=50) gives a near perfect fit:

model = Sequential()  
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=True))  
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))   
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])  

[plot: many-to-many forecast (n_pre=50, n_post=50)]

But now assume we have data that looks like this:
dataX or input: (nb_samples, nb_timesteps, nb_features) -> (1000, 50, 1)
dataY or output: (nb_samples, nb_timesteps, nb_features) -> (1000, 10, 1)

The solution given in #2403 is to build the model like this:

model = Sequential()  
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  
model.add(RepeatVector(10))
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))   
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])  

Well, it compiles and trains, but the prediction is really bad:

[plot: many-to-many forecast with RepeatVector (n_pre=50, n_post=10)]

My explanation for this is: with return_sequences=False the network has only a single vector at the end of the LSTM layer, repeats it output_dimension times and then tries to fit. The best guess it can give is the average of all the points to predict, because it no longer knows whether the sine wave is currently going up or down; that information is lost with return_sequences=False!

So, my final question is: how can I keep this information and let the LSTM layer return only part of its sequence? I don't want to fit to n_pre=50 time steps but only to 10, because in my real problem the points are not as nicely correlated as in this sine wave. Currently I feed in 50 points and crop the output (after training) to 10, but the model still tries to fit all 50, which distorts the result.

Any help would be greatly appreciated!

@javiercorrea

I think you need to do something like this:

model = Sequential()  
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  
model.add(RepeatVector(10))
model.add(LSTM(output_dim=hidden_neurons, return_sequences=True))  
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))   
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])  

otherwise you are just repeating the same vector into the last Dense layer and getting a constant value.

@Ironbell
Author

Thank you very much. I tried your suggestion and the predictions now look like this:

[plot: prediction with the two-LSTM encoder-decoder model (n_pre=50, n_post=10)]

The number of epochs and hidden neurons is the same as in the other test cases, but the prediction for 10 steps is worse than it was for 50. Is there a (simple) explanation for why it gets worse with more layers? Or does it just need to train longer because it has more parameters to adjust?

@javiercorrea

I would say that the modeling assumptions of the two approaches are different. In the latter model, it is assumed that the model sees the complete input sequence (the first 50 steps), somehow creates a summary, and uses this summary to generate a new signal (the last 10 steps).

On the other hand, your initial model estimated the last 50 steps while reading the input signal; no summarisation of the original signal was used.

@Ironbell
Author

That's a perfect and clear answer, thank you very much.

@ghost

ghost commented Jun 30, 2017

Hi, I have been studying how to use the many-to-many LSTM model to predict time-series data, and I now have the same problem that you once had. Could you share your demo .py files for predicting a simple sine wave? I would like to study your code and replace your data with mine, just to try it. It would be very kind of you to help me out. Thanks!
My email: zhangping16@mails.ucas.ac.cn

@Ironbell
Author

Here you go!
test_sine.txt

@bestazad

bestazad commented Mar 1, 2019

Hi there!
It seems that in newer versions of Keras the input_dim and output_dim arguments have been replaced (the input shape is now given via the input_shape argument). Could you edit these parts of the code to match the new version:


model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  
model.add(LSTM(output_dim=hidden_neurons, return_sequences=True))

I also have another question: what is the reason for using model.add(Activation('linear'))?
Thanks in advance!

@pusj

pusj commented Mar 5, 2019

Hi @bestazad ,

You can obtain the same result using input_dim or input_shape; to my knowledge, both of these "alternatives" have been available for quite some time:

https://stackoverflow.com/questions/53106111/in-keras-when-should-i-use-input-shape-instead-of-input-dim

The reason why model.add(Activation('linear')) is used is most likely that this is only a tentative example; other activation functions can probably give similar results here.
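
For reference, here is a rough sketch of how the RepeatVector model from earlier in this thread might look with the newer API (input_shape and units instead of input_dim and output_dim); n_pre, n_post and hidden_neurons are placeholder values, not taken from the original code:

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense, Activation

n_pre, n_post, hidden_neurons = 50, 10, 64   # placeholder sizes

model = Sequential()
# input_shape=(timesteps, features) replaces input_dim; the first positional
# argument (units) replaces output_dim
model.add(LSTM(hidden_neurons, input_shape=(n_pre, 1), return_sequences=False))
model.add(RepeatVector(n_post))
model.add(LSTM(hidden_neurons, return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop')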

@gustavz

gustavz commented Jul 8, 2019

How would you train the model on variable input length?

@pusj

pusj commented Jul 8, 2019

Hi @gustavz

Two options/suggestions:

  1. Padding https://machinelearningmastery.com/data-preparation-variable-length-input-sequences-sequence-prediction/
  2. Sequence bucketing https://arxiv.org/ftp/arxiv/papers/1708/1708.05604.pdf

Padding looks easier but I would guess that this method also decreases the usefulness of the model.
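
As a rough illustration of option 1 (not from the thread): variable-length sequences can be zero-padded to a common length, and a Masking layer tells the LSTM to ignore the padded steps. The lengths and layer sizes below are made up for the example:

import numpy as np
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense

# toy sequences of different lengths, one feature per time step
seqs = [np.random.rand(n, 1) for n in (30, 45, 50)]

# pad every sequence with zeros at the end up to the longest length
X = pad_sequences(seqs, maxlen=50, dtype='float32', padding='post')   # shape (3, 50, 1)

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(50, 1)))   # padded steps are skipped downstream
model.add(LSTM(32))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop')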

If you (or anybody else) could help me with a good explanation of what RepeatVector() does here, I would be happy. The best reference I found is https://stackoverflow.com/questions/51749404/how-to-connect-lstm-layers-in-keras-repeatvector-or-return-sequence-true , but that is for an Encoder/Decoder network and I'm not sure whether the same applies to an LSTM network. E.g., does RepeatVector() repeat the original input (from the very first layer), or does it work with the inputs/outputs between hidden layers?
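
As far as I understand the Keras docs, RepeatVector works on the 2-D output of whatever layer feeds into it (not the original input): it simply tiles that vector n times along a new time axis. A tiny illustrative check, assuming a recent Keras:

import numpy as np
from keras.models import Sequential
from keras.layers import RepeatVector

model = Sequential([RepeatVector(4, input_shape=(3,))])   # repeat the incoming 2-D vector 4 times
x = np.array([[1.0, 2.0, 3.0]])                           # shape (1, 3): one sample, three features
y = model.predict(x)                                      # shape (1, 4, 3): same vector tiled along a new time axis
print(y.shape)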

@gustavz

gustavz commented Jul 9, 2019

what is the difference between:

model = Sequential()  
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=False))  
model.add(RepeatVector(10))
model.add(LSTM(output_dim=hidden_neurons, return_sequences=True))  
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear')) 

and

model = Sequential()  
model.add(LSTM(input_dim=1, output_dim=hidden_neurons, return_sequences=True))  
model.add(LSTM(output_dim=hidden_neurons, return_sequences=False))  
model.add(Dense(10))

maybe best explained with this image

@pusj

pusj commented Jul 9, 2019

Thanks for this; which is which? I've added some numbers to your image to better reference the variants. I assume that the code that contains RepeatVector() is represented by variant 4 and the code that does not contain RepeatVector() by variant 5. Is this correct?

[annotated image: the diagram above with the variants numbered]

Thanks! :-)

@gustavz

gustavz commented Jul 9, 2019

Option 1 is an Encoder-Decoder, Option 2 is a Vanilla LSTM

@byamao1

byamao1 commented Feb 6, 2020

Option 1 is part 4 of the image?

@0xsimulacra


To answer the question above: the code that does not contain RepeatVector() is a many-to-one architecture (variant 3). To get a many-to-many architecture you have to modify it so that return_sequences=True is set in both LSTM layers, not only the first one.
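
A minimal sketch of that modification, with placeholder sizes (with return_sequences=True in both layers, the output sequence has the same length as the input):

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense

n_steps, hidden_neurons = 50, 64   # placeholder sizes

model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(n_steps, 1), return_sequences=True))
model.add(LSTM(hidden_neurons, return_sequences=True))   # keep the full sequence in the second layer too
model.add(TimeDistributed(Dense(1)))                     # one output value per input time step
model.compile(loss='mean_squared_error', optimizer='rmsprop')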

@akshat-suwalka

akshat-suwalka commented Jul 17, 2021

Can anybody help me write the code for the 5th case of the above image?
Specifically in Keras.

@GODJOSE27

Can anyone help me with forecasting time series using a CNN-LSTM? I tried, but it doesn't attach the forecast to the test data.
You can reach me via godjose70@yahoo.com
