Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
[Read the Paper] | [Take a Survey to Access the Data] | [Download the Data]
It is important to consider the subtle tricks that many extremists use to mask their threats and abuse. These more implicit forms of hate speech may easily go undetected by keyword detection systems, and even the most advanced architectures can fail if they have not been trained on implicit hate speech (Caselli et al. 2020).
If you have not already, please first complete a short survey. Then follow this link to download (2 MB, expands to 6 MB).
This dataset contains 22,056 tweets from the most prominent extremist groups in the United States; 6,346 of these tweets contain implicit hate speech. We decompose the implicit hate class using the following taxonomy (distribution shown on the left).
- (24.2%) Grievance: frustration over a minority group's perceived privilege.
- (20.0%) Incitement: implicitly promoting known hate groups and ideologies (e.g. by flaunting in-group power).
- (13.6%) Inferiority: implying some group or person is of lesser value than another.
- (12.6%) Irony: using sarcasm, humor, and satire to demean someone.
- (17.9%) Stereotypes: associating a group with negative attribute using euphemisms, circumlocution, or metaphorical language.
- (10.5%) Threats: making an indirect commitment to attack someone's body, well-being, reputation, liberty, etc.
- (1.2%) Other
Each of the 6,346 implicit hate tweets also has free-text annotations for target demographic group and an implied statement to describe the underlying message (see banner image above).
State-of-the-art neural models may be able to learn from our data how to (1) classify this more difficult class of hate speech and (3) explain implicit hate by generating descriptions of both the target and the implied message. As our paper baselines show, neural models still have a ways to go, especially with classifying implicit hate categories, but overall, the results are promising, especially with implied statement generation, an admittedly challenging task.
We hope you can extend our baselines and further our efforts to understand and address some of these most pernicious forms of language that plague the web, especially among extremist groups.
Citation:
ElSherief, M., Ziems, C., Muchlinski, D., Anupindi, V., Seybolt, J., De Choudhury, M., & Yang, D. (2021). Latent Hatred: A Benchmark for Understanding Implicit Hate Speech. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).
BibTeX:
@inproceedings{elsherief-etal-2021-latent,
title = "Latent Hatred: A Benchmark for Understanding Implicit Hate Speech",
author = "ElSherief, Mai and
Ziems, Caleb and
Muchlinski, David and
Anupindi, Vaishnavi and
Seybolt, Jordyn and
De Choudhury, Munmun and
Yang, Diyi",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.29",
pages = "345--363"
}