question about training data #6

Fangwq · 2019-10-11T05:14:31Z

Hi, I read the code carefully and am a little confused about the training data. When I checked it, the X and Y generated by the generate_training_data function do not seem to differ much (see the following figure: X[:30, :3] - Y[:30, :3], with the data taken from the car_example.py file). Why do you generate the training data like that? If I replaced it with (X, X + random noise), would it make any difference? Thank you very much!


helgeanl · 2019-10-12T22:38:01Z

The generate_training_data function uses a Latin hypercube to randomly select samples that cover the five-dimensional space of the car model (3 states + 2 inputs); these samples form the X matrix. A more thorough explanation is given in my thesis, which is linked in the README file. For each sample we simulate one 50 ms timestep forward to get the Y matrix, which samples the dynamics of the model. The noise added to the Y matrix simulates measurement noise. Since we only integrate one timestep, the difference between Y and X will be small, especially for samples where the input values (u) are low. We could have chosen a longer sampling period, but then we would risk aliasing issues where the GP model is not able to learn the fast dynamics of the car model.
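As a rough illustration of the procedure described above, here is a minimal sketch, not the repository's actual generate_training_data. The linear `toy_dynamics` model, the sampling bounds, the noise level, and all function names are assumptions made up for this example; only the overall recipe (Latin hypercube over 3 states + 2 inputs, one 50 ms step forward, noise added to Y) comes from the explanation.

```python
import numpy as np


def latin_hypercube(n_samples, lower, upper, rng):
    """Draw n_samples points with exactly one point per stratum in each dimension."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # One uniform draw inside each of the n_samples equal-width strata of [0, 1).
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    # Shuffle the strata independently in every dimension.
    for j in range(dim):
        unit[:, j] = rng.permutation(unit[:, j])
    return lower + unit * (upper - lower)


def toy_dynamics(x, u):
    """Hypothetical stand-in for the car model (3 states, 2 inputs), batched over rows."""
    A = np.array([[-0.5, 0.1, 0.0],
                  [0.0, -1.0, 0.5],
                  [0.0, -0.3, -2.0]])
    B = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [0.0, 4.0]])
    return x @ A.T + u @ B.T


def rk4_step(x, u, dt):
    """Integrate one timestep with classic RK4, holding the input constant."""
    k1 = toy_dynamics(x, u)
    k2 = toy_dynamics(x + 0.5 * dt * k1, u)
    k3 = toy_dynamics(x + 0.5 * dt * k2, u)
    k4 = toy_dynamics(x + dt * k3, u)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def make_training_data(n_samples, dt=0.05, noise_std=1e-3, seed=0):
    """Return X = [state, input] samples and Y = noisy state one timestep later."""
    rng = np.random.default_rng(seed)
    # Assumed bounds: states in [-1, 1], inputs in [-0.5, 0.5].
    lower = [-1.0, -1.0, -1.0, -0.5, -0.5]
    upper = [1.0, 1.0, 1.0, 0.5, 0.5]
    X = latin_hypercube(n_samples, lower, upper, rng)
    x, u = X[:, :3], X[:, 3:]
    # Simulated measurement noise on the propagated state.
    Y = rk4_step(x, u, dt) + noise_std * rng.standard_normal((n_samples, 3))
    return X, Y


X, Y = make_training_data(n_samples=500)
# Small by construction: one 50 ms step only moves the state a little.
print("largest one-step state change:", np.abs(Y - X[:, :3]).max())
```

In this sketch the small difference Y - X[:, :3] is exactly the one-step effect of the dynamics plus noise, which is the quantity the GP has to learn.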

If you add random noise to the X matrix (within the limits of the state space), it would not matter, since the X matrix itself is by definition randomly distributed. It is also possible to use measured or simulated time series as training data in the X matrix instead of the one-shot samples, but then we get a lot of redundant data and would need considerably more data points to pick up all the system dynamics. I think I also discuss this briefly in the thesis in relation to online learning. In a real system it would not be feasible to use a Latin hypercube to gather training data, but since we have the simulated model we have the option to cheat and get optimally distributed training data.
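To make the redundancy argument concrete, the following sketch compares how much of a bounded space is visited by one-shot Latin hypercube samples and by a single time series of the same length. The two-state damped oscillator, the bounds, the bin count, and the `strata_covered` helper are all hypothetical choices for illustration and are not taken from the repository or the thesis.

```python
import numpy as np


def strata_covered(samples, lower, upper, n_bins):
    """Fraction of the n_bins equal-width bins that are hit, per dimension."""
    scaled = (samples - lower) / (upper - lower)
    idx = np.clip((scaled * n_bins).astype(int), 0, n_bins - 1)
    return np.array([np.unique(idx[:, j]).size / n_bins
                     for j in range(samples.shape[1])])


rng = np.random.default_rng(0)
n = 200
lower, upper = -np.ones(2), np.ones(2)

# One-shot samples: Latin hypercube with one point per stratum and dimension.
unit = (np.arange(n)[:, None] + rng.random((n, 2))) / n
for j in range(2):
    unit[:, j] = rng.permutation(unit[:, j])
lhs = lower + unit * (upper - lower)

# Time series: a hypothetical damped oscillator sampled every 50 ms.
dt = 0.05
A = np.array([[0.0, 1.0],
              [-1.0, -0.4]])
traj = np.empty((n, 2))
traj[0] = [0.9, 0.0]
for k in range(n - 1):
    traj[k + 1] = traj[k] + dt * (A @ traj[k])  # forward Euler step

lhs_cov = strata_covered(lhs, lower, upper, n)
traj_cov = strata_covered(traj, lower, upper, n)
print("Latin hypercube coverage per dimension:", lhs_cov)
print("time-series coverage per dimension:   ", traj_cov)
```

With the same number of points, the trajectory stays in the part of the space that the dynamics happen to pass through, while the Latin hypercube samples spread over the whole range in every dimension.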

Fangwq · 2019-10-13T12:45:10Z

Thank you very much for your detailed explanation!
