The challenge of automatically moderating harmful comments online has received considerable research attention recently, but the focus has mostly been on detecting harmful content in individual messages after they have been posted. Some authors have instead tried to predict whether a conversation will derail into harmfulness using features of its first few messages. In this paper, we combine that approach with previous work on harmful message detection using sentiment information, and show how the sentiments expressed in the first messages of a conversation can help predict upcoming harmful messages. Our results show that adding sentiment features improves the accuracy of harmful message prediction, and allows us to make important observations about the general task of preemptive harmfulness detection.
Article ID: 2021L01
Publisher: Canadian Artificial Intelligence Association