My thesis for the Master's degree in Data Science at the University of Navarra.
This thesis tests the hypothesis that the language Twitter users use when talking about a company is related to the behavior of its share price. Specifically, the study takes as its reference Tesla, the American electric vehicle and clean energy company. Tweets about Tesla were collected over a period of 8 months, and the aim is to relate them to the change in share price over ten-minute intervals.
For this purpose, an exploratory analysis was first carried out on the text of tweets containing the word Tesla, followed by the application of both supervised and unsupervised learning techniques. Note that this project has no investment objective. It is purely a research project that states a hypothesis and examines whether it holds.
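As a rough illustration of the ten-minute windows mentioned above, the sketch below groups tweet timestamps and price ticks into ten-minute buckets and pairs each bucket's tweet count with the relative price change from the previous bucket. It is not the code used in the thesis: the function names, the input shapes, the choice to relate tweets to the price change of the same window, and the sample values are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)


def floor_to_window(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its ten-minute window."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)


def align_tweets_with_price(tweet_times, price_ticks):
    """Return (window_start, tweet_count, relative_price_change) rows.

    tweet_times: iterable of datetime objects (hypothetical input shape).
    price_ticks: iterable of (datetime, price) pairs sorted by time.
    """
    # Number of tweets that fall in each window.
    tweet_counts = Counter(floor_to_window(ts) for ts in tweet_times)

    # Closing price per window: the last tick seen in a window wins.
    closes = {}
    for ts, price in price_ticks:
        closes[floor_to_window(ts)] = price

    rows = []
    windows = sorted(closes)
    for prev, cur in zip(windows, windows[1:]):
        if cur - prev != WINDOW:
            continue  # skip gaps such as nights and weekends
        change = (closes[cur] - closes[prev]) / closes[prev]
        rows.append((cur, tweet_counts[cur], change))
    return rows


# Invented sample data, for illustration only.
day = datetime(2020, 1, 2)
tweets = [day.replace(hour=9, minute=m) for m in (31, 34, 42, 55)]
prices = [
    (day.replace(hour=9, minute=35), 100.0),
    (day.replace(hour=9, minute=45), 102.0),
    (day.replace(hour=9, minute=55), 101.0),
]
rows = align_tweets_with_price(tweets, prices)
```

Each row could then be extended with text features from the tweets in that window and used as one observation for the supervised or unsupervised models. Whether tweets should be paired with the price change of the same window or of the following one is a design decision this sketch does not settle.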
Due to their size, the datasets are not included in this repo. If you are interested in using them, contact me and I will send them to you.