Fake or Real Tweets - BERT, LSTM, TF-IDF

The dataset contains tweets about disasters such as earthquakes and wildfires. The objective is to classify whether a tweet refers to a real disaster or a fake one. Several approaches to data cleaning and model training were tried. The best model, BERT applied via transfer learning, distinguishes real from fake tweets with 89% accuracy.

The following models were trained and compared:

Bag-of-words (BOW) with Logistic Regression (accuracy 77%)
TF-IDF with Logistic Regression (accuracy 78%)
Word2Vec with LSTM (accuracy 80%)
BERT [BERT-large: 24 layers, 1024 hidden units, 16 attention heads] (accuracy 89%)
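
To make the difference between the first two feature representations concrete, here is a minimal sketch in plain Python. It is not the project's code: the toy tweets, the whitespace tokenizer, and the function names are all assumptions made for illustration, and the IDF formula is one common smoothed variant, which may differ from the weighting the project actually used. Bag-of-words keeps raw term counts, while TF-IDF down-weights terms that appear in many tweets.

```python
import math
from collections import Counter

# Hypothetical toy tweets, NOT taken from the project's dataset.
tweets = [
    "forest fire near the town",
    "my mixtape is fire",
    "earthquake shakes the town",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bow_vectors(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return the vocabulary and L2-normalized TF-IDF vectors.

    Assumed weighting: idf(t) = ln((1 + n) / (1 + df(t))) + 1,
    a common smoothed form.
    """
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    vectors = []
    for vec in counts:
        weighted = [c * w for c, w in zip(vec, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        vectors.append([x / norm for x in weighted])
    return vocab, vectors


if __name__ == "__main__":
    vocab, tfidf = tfidf_vectors(tweets)
    for term, weight in zip(vocab, tfidf[0]):
        if weight > 0:
            print(f"{term}: {weight:.3f}")
```

In the first toy tweet, "forest" occurs in only one tweet and so receives a larger TF-IDF weight than "fire", which occurs in two. Either set of vectors could then be passed to a logistic regression classifier.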

This Project’s GitHub Repository

Classification
RNN
Transfer Learning
NLP
Embedding