Team PH at the COVID-19 Retweet Prediction Challenge at CIKM2020 Analyticup

This repository contains the code and other resources for our proposed solution for the COVID-19 Retweet Prediction Challenge at CIKM2020 Analyticup. The proposed solution ranked 4th on the final testing leaderboard among 20 teams (51 teams in the validation phase). The pre-print of our report is available here.

Requirements

Download the relevant data files from here (password: cgfm) and unpack any tar.gz files in the model directory
- model directory contains trained models
- tmp directory contains extracted data for training and prediction
- data directory contains raw data provided by the challenge organizers
Check and install the required packages listed in requirements.txt: pip install -r requirements.txt
All experiments were run on a commodity laptop (Intel Core i5 processor at 2.6 GHz, 8 GB of RAM, and 200 GB of swap space)
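
The unpacking step can also be scripted. The snippet below is a minimal sketch, not part of the repository: the helper name unpack_archives is hypothetical, and it assumes the archives sit directly inside the model directory and come from a source you trust.

```python
import tarfile
from pathlib import Path


def unpack_archives(model_dir="model"):
    """Extract every *.tar.gz archive found directly inside model_dir.

    Hypothetical helper (not in the repository). Returns the names of the
    archives that were unpacked, in sorted order.
    """
    model_path = Path(model_dir)
    extracted = []
    for archive in sorted(model_path.glob("*.tar.gz")):
        # Only unpack archives from a trusted source: extractall writes
        # whatever paths the archive contains.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=model_path)
        extracted.append(archive.name)
    return extracted
```

Calling unpack_archives("model") from the repository root would extract the downloaded archives in place.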

Scripts

test.py loads the test data, runs the different RERFs (Regression-enhanced Random Forests) to write their prediction results to the output directory, and ensembles those results. Afterwards, it applies personalized patching to update the final predictions for users who have a sufficient number of tweets in the training set.
train.py contains the code for training the different RERFs, which are used by test.py.
utils.py contains necessary utility functions for train.py and test.py.
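
For readers new to RERFs, here is a minimal sketch of the idea, assuming the usual formulation: a base regressor is fit first, a random forest is then fit on its residuals, and the two predictions are summed. The RERF class, the toy data, and the small forest size are illustrative assumptions and are not taken from train.py.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


class RERF:
    """Illustrative regression-enhanced random forest (hypothetical class)."""

    def __init__(self, base, forest):
        self.base = base      # any regressor with fit/predict
        self.forest = forest  # random forest fit on the base model's residuals

    def fit(self, X, y):
        self.base.fit(X, y)
        residuals = y - self.base.predict(X)
        self.forest.fit(X, residuals)
        return self

    def predict(self, X):
        # Final prediction = base prediction + predicted residual.
        return self.base.predict(X) + self.forest.predict(X)


# Toy data only: a linear signal plus a nonlinear term the forest can pick up.
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + np.sin(X[:, 0])

model = RERF(
    LinearRegression(fit_intercept=False),
    # Much smaller than the forests listed in this README, to keep the demo fast.
    RandomForestRegressor(n_estimators=20, max_depth=5, random_state=7),
).fit(X, y)
preds = model.predict(X[:5])
```

Any regressor with fit and predict methods (for example an MLPRegressor) could be passed as the base model in this sketch.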

Models used for the global ensemble

No.	RERF details (regression model, then random forest)	Weight
1	LinearRegression(fit_intercept=False) RandomForestRegressor(n_estimators=500, max_depth=20, random_state=7)	1
2	MLPRegressor(batch_size=1024, hidden_layer_sizes=(64,32,16,8,8), random_state=7) RandomForestRegressor(n_estimators=1000, max_depth=18, random_state=7)	1
3	MLPRegressor(batch_size=2048, hidden_layer_sizes=(128,64,32,16,8,8), random_state=77) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=77)	1
4	MLPRegressor(batch_size=2048, hidden_layer_sizes=(128,64,32,16,8), random_state=18) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=18)	1
5	MLPRegressor(batch_size=2048, hidden_layer_sizes=(64,16,8), random_state=19) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=19)	1
6	MLPRegressor(batch_size=2048, hidden_layer_sizes=(128,64,16,8), random_state=20) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=20)	1
7	MLPRegressor(batch_size=4096, hidden_layer_sizes=(128,64,16,8,4), random_state=201) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=201)	1
8	MLPRegressor(batch_size=4096, hidden_layer_sizes=(128,64,32,8), random_state=211) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=211)	2
9	MLPRegressor(batch_size=4096, hidden_layer_sizes=(128,64,32,8,8), random_state=22) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=22)	1
10	MLPRegressor(batch_size=4096, hidden_layer_sizes=(128,64,32,16,8), random_state=27) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=27)	1
11	MLPRegressor(batch_size=4096, hidden_layer_sizes=(128,64,32,32,16), random_state=211) RandomForestRegressor(n_estimators=500, max_depth=18, random_state=28)	1
12	xDeepFM() RandomForestRegressor(n_estimators=500, max_depth=16, random_state=28)	1
13	DeepFM() RandomForestRegressor(n_estimators=500, max_depth=16, random_state=29)	2
14	DeepFM() RandomForestRegressor(n_estimators=500, max_depth=16, random_state=28)	1
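
The ensembling and personalized patching steps performed by test.py can be sketched as follows. This is a rough illustration under stated assumptions: the README does not say how the weights are applied, so a weighted mean is assumed here, and the patching rule (replace the global prediction once a user has at least min_tweets training tweets) is likewise an assumption. The function names, the personal_preds mapping, and the min_tweets threshold are all hypothetical.

```python
def weighted_ensemble(model_preds, weights):
    """Weighted mean of per-model prediction lists (one list per model).

    Assumption: the table's weights are used in a weighted average.
    """
    if len(model_preds) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_rows = len(model_preds[0])
    return [
        sum(w * preds[i] for preds, w in zip(model_preds, weights)) / total
        for i in range(n_rows)
    ]


def patch_predictions(global_preds, user_ids, personal_preds, train_counts, min_tweets):
    """Replace the global prediction for users with enough training tweets.

    personal_preds maps user id -> per-user prediction, and train_counts maps
    user id -> number of training tweets (both hypothetical structures).
    """
    patched = []
    for pred, user in zip(global_preds, user_ids):
        if train_counts.get(user, 0) >= min_tweets and user in personal_preds:
            patched.append(personal_preds[user])
        else:
            patched.append(pred)
    return patched


# Weights from the table above: 1 for every model except No. 8 and No. 13 (weight 2).
WEIGHTS = [1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1]
```

With these weights, models No. 8 and No. 13 each contribute 2/16 of the averaged prediction and every other model contributes 1/16.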

Citation

Guangyuan Piao and Weipeng Huang, "Regression-enhanced Random Forests with Personalized Patching for COVID-19 Retweet Prediction", CIKM Analyticup Workshop at CIKM'20, Galway, Ireland, 2020. [PDF] [bibtex] (CIKM Analyticup Proceedings: http://ceur-ws.org/Vol-2881/)
