Profanity Classifier

Thanks to the advancement of technology and the emergence of social media, anyone’s message can be broadcast to the world at the push of a button. This double-edged sword can be used to empower people or to bully them, to spread awareness or hate.

The aim of this project was to suppress hate and make the online world a safer place for children and adults.

The result is a machine learning model that uses recurrent neural networks to detect whether a piece of text contains negative sentiment or hate. It can be used to prevent a swearword from being posted on social media, or to bleep-censor audio in real time during a live broadcast.

Since this model analyses inputs based on sub-word character n-grams, even variations of swearwords are detected and flagged as inappropriate. What do I mean by variations? Try inputting the following phrases in the text field below:

• You’re an @ss-hole

• You’re f**king annoying
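To illustrate why obfuscated spellings like these can still be caught, here is a minimal sketch of character n-gram extraction in the style fastText uses. The function names `char_ngrams` and `ngram_overlap`, the n-gram lengths of 3 to 6, and the `<`/`>` word-boundary markers are assumptions chosen for illustration; this is not the project's actual code.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the set of character n-grams of `word`.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce distinct n-grams (assumed convention, fastText-style).
    """
    wrapped = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.add(wrapped[i:i + n])
    return grams


def ngram_overlap(a, b):
    """Jaccard overlap between the n-gram sets of two words."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    union = ga | gb
    if not union:
        return 0.0
    return len(ga & gb) / len(union)


# The censored spelling still shares several n-grams with the plain one,
# while an unrelated word shares none.
print(sorted(char_ngrams("f**king") & char_ngrams("fucking")))
print(ngram_overlap("f**king", "fucking"))
print(ngram_overlap("banana", "fucking"))
```

Because a word's representation is built from pieces like these rather than from the whole token, a variant that was never seen during training can still land close to the word it imitates.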

Date

Sep 2018 - May 2019

Human hours

263

What the profanity classifier involved (stack)

tensorflow
tensorflow.js
Long Short-Term Memory (LSTM)
keras
Recurrent Neural Networks
Word Embeddings
pandas
matplotlib

Try it for yourself!

You need to get a bit indecent when testing this!
Input: Truly sh ite
Scores: 0.706407, 0.243457, 0.218749, 0.024082, 0.048196

Input: I can never get my head round why people take a perfectly functional and user friendly App and then destroy it with a supposed update. You can't even top up from the dam thing anymore, what's the point??!! Crap crap crap EE you should be ashamed.
Scores: 0.926028, 0.537681, 0.758657, 0.010513, 0.061627

Input: This app is fu_cking dreadful, can't login , keeps sending me into a loop of entering my phone number. EE your a disgrace, resent paying you anything
Scores: 0.991210, 0.757937, 0.968736, 0.026089, 0.095399

Input: So... ....slow
Scores: 0.027596, 0.006523, 0.008584, 0.002531, 0.002192

Input: You dirty lying bastards claiming apple music is free and it take my data so now ive had to purchase more
Scores: 0.983446, 0.761073, 0.668111, 0.035630, 0.134526
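To turn scores like these into a post-or-block decision, one option is a simple threshold on the highest score. The sketch below is only an assumption about how that could be done: the function name `is_inappropriate` and the 0.5 cut-off are hypothetical, and the five outputs are treated as an unlabeled vector.

```python
# Scores copied from two of the demo examples above. The five outputs are
# handled as an unlabeled vector; only their magnitudes are used here.
EXAMPLES = {
    "Truly sh ite": [0.706407, 0.243457, 0.218749, 0.024082, 0.048196],
    "So... ....slow": [0.027596, 0.006523, 0.008584, 0.002531, 0.002192],
}


def is_inappropriate(scores, threshold=0.5):
    """Flag a text when any output score reaches the threshold.

    The 0.5 default is a hypothetical choice, not a value from the project.
    """
    return max(scores) >= threshold


for text, scores in EXAMPLES.items():
    print(f"{text!r}: flagged={is_inappropriate(scores)}")
```

A lower threshold would catch more borderline text at the cost of more false positives, so the value would need tuning for the platform in question.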


Filtering negativity throughout the internet

Model achieved 97.90% accuracy against a dataset of 100k unseen records.

Trained the model using Google's TensorFlow and the Keras library.

Used fastText word embeddings to evaluate text on a sub-word character n-gram basis.

Converted the Python model with TensorFlow.js to make it available in the browser (demo above).