We present O-Norm (Offense Normalizer), a system for normalizing offensive words and phrases by reversing user-created obfuscations. O-Norm is intended as a preprocessing tool for pipelines that run on user-created text, particularly tasks where obscenities carry significant signal, such as sentiment analysis, toxicity detection, and abuse detection. O-Norm is trained on a purely generated, context-free dataset derived from a curse dictionary. This generative dataset keeps O-Norm flexible as new words enter the language, and also allows it to be readily retrained for other Latin-alphabet languages, since no manual annotations are required. O-Norm is based on a character-level transformer network that attempts de-obfuscation only on out-of-vocabulary (OOV) tokens. In an 80/20 train-test split, O-Norm achieves an F1 score of 89.6% over 141 curses in a generated dataset with 2.16 million unique training points. An inspection of O-Norm's output on a sample of social media posts from Kaggle's Jigsaw corpus shows an accuracy of 95.7% at de-obfuscating transformations in toxic user-created text.
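The OOV-gated pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not O-Norm itself: the toy vocabulary and the fixed character-substitution table are assumptions standing in for O-Norm's learned character-level transformer, which would replace `deobfuscate` in the real system.

```python
# Hypothetical in-vocabulary words; a real system would use a full lexicon.
VOCAB = {"this", "movie", "is", "so", "bad"}

# Toy reverse-obfuscation table. O-Norm learns such mappings (and far more
# complex ones) with a character-level transformer; this dict is a stand-in.
CHAR_MAP = {"@": "a", "$": "s", "1": "i", "0": "o", "3": "e"}

def deobfuscate(token: str) -> str:
    """Stand-in for the transformer: undo common character substitutions."""
    return "".join(CHAR_MAP.get(c, c) for c in token)

def normalize(text: str) -> str:
    """Attempt de-obfuscation only on out-of-vocabulary (OOV) tokens."""
    out = []
    for tok in text.lower().split():
        out.append(tok if tok in VOCAB else deobfuscate(tok))
    return " ".join(out)
```

In-vocabulary tokens pass through untouched, so well-formed text is never altered; only OOV tokens (e.g. `b@d`) are candidates for normalization. This gating is what lets the system run cheaply as a preprocessing step.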
Article ID: 2021L16
Month: May
Year: 2021
Address: Online
Venue: Canadian Conference on Artificial Intelligence
Publisher: Canadian Artificial Intelligence Association
URL: https://caiac.pubpub.org/pub/5uqi2h7k/