NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles

NewsMTSC is a dataset for target-dependent sentiment classification (TSC) on news articles reporting on policy issues. The dataset consists of more than 11k labeled sentences, which we sampled from articles of online US news outlets. More information can be found in our paper published at EACL 2021.

This repository contains the dataset and our model, GRU-TSC, which achieves state-of-the-art TSC performance on NewsMTSC. Check it out - it works out of the box :-)

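This readme consists of the following parts: Installation, Target-dependent Sentiment Classification, Training, Acknowledgements, and How to cite.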

If you are only looking for the dataset, you can download it here or view it here.

To make the model accessible to users without programming skills, we aimed to make installing and using it as easy as possible. If you run into a problem with the model or notice an error in our dataset, you are more than welcome to open an issue.

Installation

It's super easy, we promise!

NewsMTSC was tested on macOS and Ubuntu; other operating systems may work, too. Let us know :-)

1. Set up the environment:

This step is optional if you already have Python 3.7 installed (check with python --version). If you don't have Python 3.7, we recommend using Anaconda to set up the requirements. If you do not have Anaconda yet, follow its installation instructions.

To set up a Python 3.7 environment (in case you don't have one yet) you may use, for example:

conda create --yes -n newsmtsc python=3.7
conda activate newsmtsc

FYI, for users of virtualenv, the equivalent command would be:

virtualenv -ppython3.7 --setuptools 45 venv
source venv/bin/activate

2. Install NewsSentiment:

pip3 install NewsSentiment        # without cuda support
pip3 install "NewsSentiment[cuda]"  # with cuda support (the quotes keep shells such as zsh from expanding the brackets)

You're all set now. All required models are downloaded automatically on demand :-)
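
If you want to double-check the setup before the first run, a small script along the following lines may help. It is only a sketch and not part of the repository: it assumes that the import name matches the package name NewsSentiment and that Python 3.7 is the version you are aiming for, as recommended above. The function name check_environment is made up for this example.

```python
import importlib.util
import sys


def check_environment(package="NewsSentiment", expected=(3, 7)):
    """Report whether the interpreter version and the package look as expected.

    Both defaults are assumptions taken from this readme, not from the package.
    """
    return {
        # Compare only major and minor version, e.g. (3, 7).
        "python_ok": sys.version_info[:2] == expected,
        # find_spec returns None if no top-level module of that name is found.
        "package_installed": importlib.util.find_spec(package) is not None,
    }


print(check_environment())
```

If package_installed is reported as False, the pip3 command above most likely ran in a different environment than the one that is currently active.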

Target-dependent Sentiment Classification

Please note that running infer.py (or its first import) and the first run of TargetSentimentClassifier can take some time depending on your internet connection speed. NewsSentiment downloads and loads the required models during this time.

Target-dependent sentiment classification works out-of-the-box. Have a look at infer.py or give it a try:

python3.7 infer.py
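
"Target-dependent" means that the sentiment is determined for one specific mention in a sentence rather than for the sentence as a whole, so the position of the target has to be known. The sketch below illustrates this idea by splitting a sentence into the text left of the target, the target itself, and the text right of it. The helper split_on_target and the type TargetedSentence are hypothetical names for this example; whether infer.py expects its input in exactly this form is an assumption, so please check infer.py for the actual interface.

```python
from typing import NamedTuple


class TargetedSentence(NamedTuple):
    """A sentence split around one target mention (hypothetical helper type)."""
    left: str
    target: str
    right: str


def split_on_target(sentence: str, target: str) -> TargetedSentence:
    """Split the sentence at the first occurrence of the target mention."""
    start = sentence.find(target)
    if start == -1:
        raise ValueError(f"target {target!r} does not occur in the sentence")
    end = start + len(target)
    return TargetedSentence(sentence[:start], target, sentence[end:])


example = split_on_target("I like Peter but I don't like Robert.", "Peter")
print(example)
```

The same sentence can be split a second time with "Robert" as the target, which is why a sentence with several targets may receive a different sentiment label for each of them.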

Training

If you want to train one of our models or your own model, please clone the repository first.

git clone git@github.com:fhamborg/NewsMTSC.git

There are two entry points to the system. train.py is used to train and evaluate a specific model on a specific dataset using specific hyperparameters. We call a single run an experiment. controller.py is used to run multiple experiments automatically. This is useful, for example, for model selection and for evaluating hundreds or thousands of combinations of models, hyperparameters, and datasets.

Running a single experiment

Goal: training a model with a user-defined (hyper)parameter combination.

train.py allows fine-grained control over the training and evaluation process, yet for most command line arguments we provide useful defaults. Two arguments are required:

--own_model_name (which model is used, e.g., grutsc),
--dataset_name (which dataset is used, e.g., newsmtsc-rw).

For more information refer to train.py and combinations_absadata_0.py. If you just want to get started quickly, the command below should work out of the box.

python train.py --own_model_name grutsc --dataset_name newsmtsc-rw
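
The pattern of two required arguments plus defaults for everything else can be sketched with Python's argparse module as shown below. This is not the actual parser in train.py. Only --own_model_name and --dataset_name are taken from this readme; --example_option and its default are placeholders invented for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with two required arguments and one defaulted option."""
    parser = argparse.ArgumentParser(description="Sketch of a training entry point")
    # The two arguments that this readme names as required.
    parser.add_argument("--own_model_name", required=True, help="model to use, e.g., grutsc")
    parser.add_argument("--dataset_name", required=True, help="dataset to use, e.g., newsmtsc-rw")
    # Hypothetical option: stands in for the arguments that have useful defaults.
    parser.add_argument("--example_option", type=int, default=3, help="placeholder, not a real flag")
    return parser


# Passing an explicit list avoids reading the real command line in this sketch.
args = build_parser().parse_args(
    ["--own_model_name", "grutsc", "--dataset_name", "newsmtsc-rw"]
)
print(args)
```

With a parser like this, omitting one of the two required arguments makes the script exit with a usage message, while all other options silently fall back to their defaults.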

Running multiple experiments

Goal: finding the (hyper)parameter combination to train a model that achieves the best performance.

controller.py takes a set of values for each argument, creates combinations of arguments, applies conditions to remove unnecessary combinations (e.g., some arguments may only be used for a specific model), and creates a multiprocessing pool to run experiments of these argument combinations in parallel. After completion, controller.py creates a summary, which contains detailed results, including evaluation performance, of all experiments. By using createoverview.py, you can export this summary into an Excel spreadsheet.
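
The first steps described above (building argument combinations and dropping unnecessary ones) can be sketched as follows. This is a simplified illustration, not the code of controller.py: the argument learning_rate, the model name other_model, and the filter condition are hypothetical, and the sketch only prints the resulting train.py commands instead of running them in a multiprocessing pool.

```python
from itertools import product

# Candidate values per argument. Only own_model_name and dataset_name appear
# in this readme; "other_model" and "learning_rate" are hypothetical.
search_space = {
    "own_model_name": ["grutsc", "other_model"],
    "dataset_name": ["newsmtsc-rw"],
    "learning_rate": ["1e-5", "3e-5"],
}


def is_needed(combo):
    """Hypothetical condition: sweep learning_rate only for grutsc."""
    return combo["own_model_name"] == "grutsc" or combo["learning_rate"] == "1e-5"


def build_combinations(space, condition):
    """Create the Cartesian product of all values and keep the needed combinations."""
    names = list(space)
    combos = (dict(zip(names, values)) for values in product(*space.values()))
    return [combo for combo in combos if condition(combo)]


def to_command(combo):
    """Turn one combination into a train.py command line (as a list of tokens)."""
    command = ["python", "train.py"]
    for name, value in combo.items():
        command += [f"--{name}", value]
    return command


combinations = build_combinations(search_space, is_needed)
for combo in combinations:
    print(" ".join(to_command(combo)))
```

Filtering before execution matters because the number of combinations grows multiplicatively with every additional argument, so each dropped combination saves a full training run.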

Acknowledgements

This repository is in part based on ABSA-PyTorch. We thank Song et al. for making their excellent repository open source.

How to cite

If you use the dataset or model, please cite our paper (PDF):

@InProceedings{Hamborg2021b,
  author    = {Hamborg, Felix and Donnay, Karsten},
  title     = {NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles},
  booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)},
  year      = {2021},
  month     = {Apr.},
  location  = {Virtual Event},
}
