LSTM: Many to many sequence prediction with different sequence length #6063
Comments
I think you need to do something like this:
otherwise you are just repeating the last Dense layer and getting a constant value.
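A minimal sketch of that kind of encoder-decoder fix, with layer sizes and sequence lengths assumed purely for illustration (n_pre=50, n_post=10, one feature):

```python
# Sketch of an encoder-decoder for 50 input steps -> 10 output steps (assumed sizes).
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

n_pre, n_post, n_features, hidden = 50, 10, 1, 100

model = Sequential()
# Encoder: read the whole input window and emit a single summary vector.
model.add(LSTM(hidden, input_shape=(n_pre, n_features)))
# Repeat that summary once per output step.
model.add(RepeatVector(n_post))
# Decoder: unroll the summary into an output sequence.
model.add(LSTM(hidden, return_sequences=True))
# Map every decoder step to one predicted value.
model.add(TimeDistributed(Dense(n_features)))
model.compile(loss='mse', optimizer='adam')
```

The decoder LSTM between RepeatVector and the TimeDistributed Dense is what prevents every output step from being the same repeated value.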
Thank you very much. I tried your suggestion and the predictions now look like this: The number of epochs and hidden neurons is the same as in the other test cases, but the prediction is worse for 10 steps compared to 50. Is there a (simple) explanation for why it gets worse with more layers? Or does it just need to train longer because it has more parameters to adjust?
I would say that the modeling assumptions of both approaches are different. In the latter model, it is assumed that the model sees the complete input sequence (the first 50 steps), somehow creates a summary, and uses this summary to generate a new signal (the last 10 steps). Your initial model, on the other hand, estimated the last 50 steps while reading the input signal; no summarisation of the original signal was used.
That's a perfect and clear answer, thank you very much.
Hi, I have been studying how to use the many-to-many LSTM model to predict time series data, and I now have the same problem you once had. Could you share your demo .py files for predicting a simple sine wave with me? I'd like to study your code and try it with my own data. It would be very kind of you to help me out. Thanks!
Here you go!
Hi there!
I also have another question: what is the reason for using
Hi @bestazad, you can obtain the same result using . The reasoning why
How would you train the model on variable input length? |
Hi @gustavz, two options/suggestions:
Padding looks easier, but I would guess that this method also decreases the usefulness of the model. It would help if you (or anybody else) could give me a good explanation of what
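For the padding route, a minimal sketch with a Masking layer (shapes, layer sizes, and the mask value are assumptions for illustration):

```python
# Sketch: pad variable-length input sequences and mask the padded timesteps.
import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense
from keras.preprocessing.sequence import pad_sequences

# Toy variable-length sequences with one feature each (made-up data for illustration).
sequences = [np.random.rand(n, 1) for n in (30, 42, 50)]
targets = np.random.rand(len(sequences), 1)

maxlen = 50
# Pad every sequence to the same length with a sentinel value that never occurs in the data.
x = pad_sequences(sequences, maxlen=maxlen, dtype='float32',
                  padding='pre', value=-1.0)

model = Sequential()
# Masking tells downstream layers to skip timesteps equal to mask_value.
model.add(Masking(mask_value=-1.0, input_shape=(maxlen, 1)))
model.add(LSTM(32))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(x, targets, epochs=2, batch_size=1, verbose=0)
```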
What is the difference between:
and
maybe best explained with this image |
Option 1 is an Encoder-Decoder, Option 2 is a Vanilla LSTM |
Option 1 is part 4 of the image? |
Note that the code that does not contain RepeatVector() is a many-to-one architecture (variant 3). To get a many-to-many architecture, you have to modify the code that does not contain RepeatVector() so that return_sequences=True is set in both LSTM layers, not only the first one.
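A minimal sketch of that modification, assuming two stacked LSTM layers, 50 timesteps, and one feature (sizes are illustrative):

```python
# Sketch: many-to-many model (one output per input step) without RepeatVector.
from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense

n_steps, n_features, hidden = 50, 1, 100

model = Sequential()
# Both LSTM layers return the full sequence, so the time axis is preserved.
model.add(LSTM(hidden, return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(hidden, return_sequences=True))
# One prediction per timestep -> output shape (batch, n_steps, 1).
model.add(TimeDistributed(Dense(n_features)))
model.compile(loss='mse', optimizer='adam')
```

Because both LSTM layers keep the time axis, this produces exactly one output per input step, i.e. the synced many-to-many case (5) when input and output lengths are equal; for different lengths, the RepeatVector encoder-decoder above is the usual workaround.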
Can anybody help me write the code for the 5th case in the above image?
Can anyone help me with forecasting a time series using a CNN-LSTM? I tried, but it doesn't attach the forecasting part to the test data.
First of all, I know that there are already issues open regarding this topic, but their solutions don't solve my problem, and I'll explain why.

The problem is to predict the next `n_post` steps of a sequence given `n_pre` steps of it, with `n_post < n_pre`. I've built a toy example using a simple sine wave to illustrate it. The many-to-one forecast `(n_pre=50, n_post=1)` works perfectly. Also, the many-to-many forecast with `(n_pre=50, n_post=50)` gives a near-perfect fit. But now assume we have data that looks like this:
dataX or input: `(nb_samples, nb_timesteps, nb_features) -> (1000, 50, 1)`
dataY or output: `(nb_samples, nb_timesteps, nb_features) -> (1000, 10, 1)`
The solution given in #2403 is to build the model like this:
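Roughly, a model along these lines (a hedged reconstruction, not the exact snippet; layer sizes are assumed):

```python
# Hedged reconstruction of the #2403-style model (assumed sizes, not the original code).
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

n_pre, n_post, n_features, hidden = 50, 10, 1, 100

model = Sequential()
# return_sequences=False: only the final hidden state survives the LSTM.
model.add(LSTM(hidden, input_shape=(n_pre, n_features), return_sequences=False))
# That single vector is copied n_post times...
model.add(RepeatVector(n_post))
# ...and the same Dense mapping is applied to every copy, so each output step is identical.
model.add(TimeDistributed(Dense(n_features)))
model.compile(loss='mse', optimizer='adam')
```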
Well, it compiles and trains, but the prediction is really bad:
My explanation for this is: the network has only one piece of information (no return_sequences) at the end of the LSTM layer, repeats it output_dimension times, and then tries to fit. The best guess it can give is the average of all the points to predict, because it doesn't know whether the sine wave is currently going down or up; it loses this information with `return_sequences=False`!

So, my final question is: how can I keep this information and let the LSTM layer return a part of its sequence? I don't want to fit to `n_pre=50` time steps but only to 10, because in my real problem the points are of course not as nicely correlated as in the sine wave. Currently I just predict 50 points and then crop the output (after training) to 10, but the model still tries to fit all 50, which distorts the result.

Any help would be greatly appreciated!