Emotion-Based Image-to-Music Recommendation System
- Team name : E = MusiC^2
- Team members : 8th cohort 유채원, 장준혁, 최윤서; 9th cohort 김서진, 서연우
- image_to_music.py : final end-to-end file to get the result
- image_to_music_module.py : modules used in image_to_music.py
- image_to_sentiment
- data
- OASIS_with_minmaxscaling.csv
- dataset.py : get data and transform it into runnable format
- run.py : python file to receive arguments and run exp.py file
- exp.py : python file actually run to train
- utils.py : define functions for minor changes or transformations
- vgg.sh : pretrained VGG19_bn model
- data
- music_to_sentiment
- arousal_valence_prediction.py : file to get music's sentiment
- data
- NRC-VAD-Lexicon.txt
- da
- song_lyric.csv
- song_lyric_VA.csv
- song_lyric_embedding_300.csv
- song_lyric_embedding_pca_11.csv
- song_music_8_characteristics.csv
- song_normalized_VA_label.csv
- preprocess : preprocessing modules for VA model
- api_module.py
- da
- lyric_VA.py
- lyrics_to_vector.py
- music_preprocessing.py
- music_to_csv.py
- preprocess_module.py
- result
- arousal.pkl
- Presentation pdf
- Presentation Youtube
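- Recommend music that suits the various images in a video.
- Analyze the sentiment of images and music through Valence and Arousal measures based on Ekman's six basic emotions, and use them to implement image-to-music recommendation.
- OASIS (Open Affective Standardized Image Set) : 900 images rated for valence and arousal by 274 people
- Spotify API : crawled Spotify data to collect the artist, album, and track name, plus quantitative music measures including valence and arousal
- Musixmatch API : collected lyrics for the tracks crawled through the Spotify API
- NRC VAD Lexicon : produced by the NRC in Canada; contains Valence, Arousal, and Dominance information for about 20,000 words.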
Overview
Image2Emotion : uses a pretrained CNN model as a feature extractor to predict valence and arousal scores
- model
- Feature extractor : VGG19_bn
- Classifier : Linear layer
- Loss : MSE loss (valence + arousal)
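The loss listed above can be illustrated with a small dependency-free sketch. It assumes the valence and arousal terms are each a mean squared error and are added with equal weight; the function name `va_mse_loss` and the toy values are hypothetical and not taken from the repository.

```python
def va_mse_loss(pred, target):
    """Return valence MSE + arousal MSE for a batch of (valence, arousal) pairs.

    Assumption: both terms are weighted equally.
    """
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and the same length")
    n = len(pred)
    valence_mse = sum((p[0] - t[0]) ** 2 for p, t in zip(pred, target)) / n
    arousal_mse = sum((p[1] - t[1]) ** 2 for p, t in zip(pred, target)) / n
    return valence_mse + arousal_mse


# Toy batch of two images: (valence, arousal) predictions and labels.
predictions = [(0.5, 0.5), (1.0, 0.0)]
labels = [(0.5, 0.0), (0.0, 0.0)]
loss = va_mse_loss(predictions, labels)
```

In a PyTorch training loop the same quantity would typically come from applying an MSE criterion to each output column and summing the two results.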
- model
- Regression task to predict VA with Spotify data's columns
- Used 8 features (danceability, key, loudness, mode, speechiness, instrumentalness, liveness, tempo)
- Preprocessed with log normalization and min-max scaling
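A minimal sketch of the two preprocessing steps named above. It assumes that "log normalization" means `log(1 + x)` applied to a non-negative column and that min-max scaling maps each column to [0, 1]; which of the 8 features actually received the log transform is not stated here, so the `tempo` values below are only a toy example.

```python
import math


def log_normalize(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative feature."""
    if any(v < 0 for v in values):
        raise ValueError("log(1 + x) as used here assumes non-negative values")
    return [math.log1p(v) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy tempo column (BPM); the values are illustrative only.
tempo = [60.0, 90.0, 120.0, 180.0]
scaled_tempo = min_max_scale(log_normalize(tempo))
```

A column with negative values, such as loudness in dB, would need shifting or a different transform before `log_normalize` could be applied.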
- Lyric Embedding : Word2Vec + Weighted sum based on counts (NRC VAD Lexicon)
- Used a pretrained model (fse/word2vec-google-news-300 on Hugging Face) -> fine-tuning -> PCA
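The count-weighted lyric embedding can be sketched as follows. This is one possible reading, not the repository's code: it assumes that only words present in both the word-vector table and the NRC VAD Lexicon contribute, that each word vector is weighted by its count in the lyric, and that the sum is divided by the total count. The toy 2-dimensional vectors stand in for the 300-dimensional Word2Vec vectors, and `lyric_embedding` is a hypothetical name.

```python
from collections import Counter


def lyric_embedding(tokens, word_vectors, lexicon_words):
    """Count-weighted average of the vectors of lexicon words found in a lyric."""
    counts = Counter(
        t for t in tokens if t in word_vectors and t in lexicon_words
    )
    dim = len(next(iter(word_vectors.values())))
    total = sum(counts.values())
    if total == 0:
        # No usable word in the lyric: fall back to a zero vector.
        return [0.0] * dim
    embedding = [0.0] * dim
    for word, count in counts.items():
        for i, x in enumerate(word_vectors[word]):
            embedding[i] += count * x
    # Normalizing by the total count is an assumption of this sketch.
    return [x / total for x in embedding]


# Toy data: "the" has a vector but is not in the lexicon, so it is ignored.
toy_vectors = {"love": [1.0, 0.0], "sad": [0.0, 1.0], "the": [0.5, 0.5]}
toy_lexicon = {"love", "sad"}
tokens = "the love the love sad".split()
embedding = lyric_embedding(tokens, toy_vectors, toy_lexicon)
```

The resulting per-song vectors would then be reduced with PCA, as the pipeline above describes.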
- AutoML : chose the best 3 models and stacked them into a new model
- Valence model : Gradient Boost Regressor + Random Forest Regressor + Extra Trees Regressor
- Input : music + lyric embedding PCA + lyric VA
- Arousal model: Gradient Boost Regressor + Random Forest Regressor + LGBM Regressor
- Input : music + lyric VA
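Stacking the three valence regressors listed above could look like the sketch below, written with scikit-learn's `StackingRegressor`. The AutoML tool, the hyperparameters, and the meta-learner used in the project are not stated here, so the `Ridge` final estimator, the small `n_estimators` values, and the synthetic data from `make_regression` are all assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge

# Synthetic stand-in for the music + lyric features (8 columns, 120 songs).
X, y = make_regression(n_samples=120, n_features=8, noise=0.1, random_state=0)

valence_stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(n_estimators=20, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=20, random_state=0)),
        ("et", ExtraTreesRegressor(n_estimators=20, random_state=0)),
    ],
    final_estimator=Ridge(),  # assumed meta-learner
    cv=3,  # out-of-fold predictions feed the meta-learner
)

# Fit on the first 100 rows and predict the held-out 20.
valence_stack.fit(X[:100], y[:100])
predictions = valence_stack.predict(X[100:])
```

An arousal stack would follow the same pattern with `LGBMRegressor` from the `lightgbm` package in place of the extra-trees model.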
- Used Euclidean distance as the similarity measure (with sklearn)
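The matching step can be sketched as a nearest-neighbour lookup in valence-arousal space. The project computes the distance with sklearn; to stay dependency-free this sketch uses the standard library's `math.dist` instead, which returns the same Euclidean distance. The function name `recommend`, the song names, and the VA values are hypothetical.

```python
import math


def recommend(image_va, songs, top_k=3):
    """Return the names of the top_k songs closest to the image in VA space."""
    ranked = sorted(songs, key=lambda name: math.dist(image_va, songs[name]))
    return ranked[:top_k]


# Toy (valence, arousal) predictions for three songs.
toy_songs = {
    "song_a": (0.9, 0.8),
    "song_b": (0.2, 0.3),
    "song_c": (0.6, 0.5),
}

# Predicted (valence, arousal) of the query image.
top = recommend((0.7, 0.6), toy_songs, top_k=2)
```

Because both models output values on the same normalized scale, a smaller distance is read as a closer emotional match between image and song.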
- Matt McVicar, Bruno Di Giorgi, Baris Dundar, and Matthias Mauch. (2021). Lyric document embeddings for music tagging
- James A. Russell. (1980). A Circumplex Model of Affect
- Grekow, J. (2016). Music Emotion Maps in Arousal-Valence Space.
- Fika Hastarita Rachman, Riyanarto Sarno, Chastine Fatichah. (2019). Song Emotion Detection Based on Arousal-Valence from Audio and Lyrics Using Rule Based Method
- Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra. (2021). Emotion Embedding Spaces for Matching Music to Stories
- Github hudsonbrendon/python-musixmatch
- Analyzing music with the Spotify API
- Understanding recommender systems
- Shuai Zhang, Lina Yao, Aixin Sun, and Yi Tay. 2018. Deep Learning based Recommender System: A Survey and New Perspectives. ACM Comput. Surv. 1, 1, Article 1 (July 2018), 35 pages.
- Shankar, Devashish & Narumanchi, Sujay & Ananya, H & Kompalli, Pramod & Chaudhury, Krishnendu. (2017). Deep Learning based Large Scale Visual Recommendation and Search for E-Commerce.
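- Kakao AI Recommendation : Kakao's content-based filtering
- 이승진. 2019. A Study on Exploring Features Based on Lyric Information and Music Signals for Music Recommendation. Master's thesis, Graduate School of Convergence Science and Technology, Seoul National University.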
- X. He and L. Deng, "Deep Learning for Image-to-Text Generation: A Technical Overview," in IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 109-116, Nov. 2017, doi: 10.1109/MSP.2017.2741510.
- Pan, Y., Mei, T., Yao, T., Li, H., & Rui, Y. (2015). Jointly Modeling Embedding and Translation to Bridge Video and Language. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4594-4602.
- Yin, Pei & Zhang, Liang. (2020). Image Recommendation Algorithm Based on Deep Learning. IEEE Access. PP. 1-1. 10.1109/ACCESS.2020.3007353.
- Lee, S. J., Seo, B.-G., & Park, D.-H. (2018). Development of Music Recommendation System based on Customer Sentiment Analysis. Journal of Intelligence and Information Systems, 24(4), 197–217. https://doi.org/10.13088/JIIS.2018.24.4.197