Profanity Classifier
Thanks to the advancement of technology and the emergence of social media, anyone’s message can be broadcast to the world at the push of a button. This double-edged sword can be used to empower people or to bully them, to spread awareness or hate.
The aim of this project was to suppress hate and make the online world a safer place for children and adults.
The result is a machine learning model that uses recurrent neural networks to detect whether a piece of text contains some form of negative sentiment or hate. It can be used to prevent a swearword from being posted on social media, or to bleep-censor audio in real time during a live broadcast.
Since this model analyses inputs based on sub-word character n-grams, even variations of swearwords are detected and flagged as inappropriate. What do I mean by variations? Try entering the following phrases in the text field below:
• You’re an @ss-hole
• You’re f**king annoying
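To make the sub-word idea concrete, here is a minimal Python sketch of how character n-grams can be extracted in the style fastText uses, with the word wrapped in `<` and `>` boundary markers. The function names `char_ngrams` and `shared_ngrams`, the 3-to-6 length range, and the harmless stand-in word `ann0ying` are illustrative assumptions, not taken from this project's code.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the set of character n-grams of `word`.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    get their own n-grams. The 3..6 range is an assumed default.
    """
    marked = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the marked word.
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    return grams


def shared_ngrams(a, b, n_min=3, n_max=6):
    """Return the n-grams that two spellings have in common."""
    return char_ngrams(a, n_min, n_max) & char_ngrams(b, n_min, n_max)


# A harmless stand-in for an obfuscated word: one letter swapped for a digit.
plain = "annoying"
variant = "ann0ying"

overlap = shared_ngrams(plain, variant)
print(sorted(overlap))
```

Even with one character replaced, the two spellings still share n-grams such as `<an` and `ing`, so a model whose word representation is built from n-grams can place the variant close to the original spelling instead of treating it as an unknown word.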
Date
Sep 2018 - May 2019
Human hours
263
What the profanity classifier involved (stack)
Try it for yourself!
The model achieved 97.90% accuracy on a dataset of 100k unseen records.
Trained the model using Google's TensorFlow and the Keras library.
Used fastText word embeddings to evaluate text on a sub-word character n-gram basis.
Converted the Python model using TensorFlow.js to make it available in the browser (demo above).